
Lucene indexing and searching for multiple languages


Hi all,

I'm managing a multi-country, multi-language site on CQ 5.6.1.

I would like to use advanced search functionality for all languages on the site (English, Italian, Spanish, French, German, etc.).

First of all, I would like to use stemming.

By default, stemming is enabled only for English and does not work for the other languages.

I know that Lucene can be configured through repository.xml, in the tag <SearchIndex class="com.day.crx.query.lucene.LuceneHandler">.

However, only one analyzer can be set there at a time.

Is there a way to configure multiple analyzers, synonym providers, tokenizers, etc. according to the path in the repository? That is, for the Italian tree of the site, Lucene would use the Italian analyzer, tokenizer, etc. for both indexing and searching, while for the English tree it would use the English analyzer, tokenizer, etc., and so on for the other languages.

Thanks

Aldo

2 Replies


Here is an AEM Gems session on Oak Lucene indexes:

http://scottwestover.blogspot.ca/2016/01/aem-gems-oak-lucene-indexes.html

As for your other question:

"Is there a way to configure multiple analyzers, synonym providers, tokenizers, etc. according to the path in the repository?"

I am not clear on what you want to do.


Hi smacdonald2008,

I think Oak indexes were only introduced with AEM 6, but I'm working on CQ 5.6.1.

I would like to know whether I can use stemming for, say, English and Italian, depending on the site in which the search is executed.

That is, if I search under the path /chicco/site/it/, stemming should work for Italian, while if I search under /content/site/en, stemming should work for English.
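
To make the requirement concrete, the path-to-language selection could be sketched in plain Java as follows. This is only an illustration of the behaviour being asked for, not a CQ, Jackrabbit, or Lucene API: the class `LanguageResolver`, its `resolve` method, and the fallback to English are all hypothetical, and the sketch does not show how the result would be wired into the index or the query handler.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: not part of CQ, Jackrabbit, or Lucene.
// It only decides which language's stemming rules a path should use.
final class LanguageResolver {

    // Assumed fallback, since stemming works for English by default.
    static final String DEFAULT_LANGUAGE = "en";

    // Site root -> language code; the two roots are the ones mentioned above.
    private static final Map<String, String> ROOTS = new LinkedHashMap<>();
    static {
        ROOTS.put("/chicco/site/it", "it");
        ROOTS.put("/content/site/en", "en");
    }

    // Returns the language code of the site tree that contains the given path.
    static String resolve(String path) {
        if (path == null) {
            return DEFAULT_LANGUAGE;
        }
        for (Map.Entry<String, String> entry : ROOTS.entrySet()) {
            String root = entry.getKey();
            // Match the root itself or any descendant, but not a sibling
            // node whose name merely starts with the same characters.
            if (path.equals(root) || path.startsWith(root + "/")) {
                return entry.getValue();
            }
        }
        return DEFAULT_LANGUAGE;
    }
}
```

With this sketch, a search rooted at /chicco/site/it/ would select Italian and one rooted at /content/site/en would select English; any path outside the listed trees falls back to the assumed default.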

 

Aldo