Lucene indexing and searching for multiple languages | Community
July 15, 2016

Lucene indexing and searching for multiple languages

Hi all,

I'm managing a multi-country, multi-language site on CQ 5.6.1.

I would like to use advanced search functionality for all the languages on the site (English, Italian, Spanish, French, German, etc.).

First of all, I would like to use stemming.

By default, it is enabled only for English and does not work for the other languages.

I know that Lucene can be configured through repository.xml, in the <SearchIndex class="com.day.crx.query.lucene.LuceneHandler"> element.

However, only one analyzer, for example, can be set at a time.

Is there a way to configure multiple analyzers, synonym providers, tokenizers, etc. according to the path in the repository? That is, for the Italian tree of the site Lucene would use the Italian analyzer, tokenizer, etc. both to index and to search, while for the English tree it would use the English analyzer, tokenizer, etc. both to index and to search, and so on.

Thanks

Aldo

1 reply

smacdonald2008
Level 10
July 18, 2016

Here is a GEMS session on Lucene Indexing:

http://scottwestover.blogspot.ca/2016/01/aem-gems-oak-lucene-indexes.html

As for the other question:

"Is there a way to configure multiple analyzers, synonym providers, tokenizers, etc. according to the path in the repository?"

I am not clear on what you want to do. 

July 18, 2016

Hi smacdonald2008

I think Oak indexes were only introduced with AEM 6, but I'm working on CQ 5.6.1.

I would like to know whether I can use stemming for, say, English and Italian, depending on the site in which the search is executed.

That is, if I search under the path /chicco/site/it/, stemming should work for Italian, whereas if I search under /content/site/en, stemming should work for English.
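
To make the requirement concrete, the path-based selection could be sketched roughly as below. This is only an illustration of the lookup logic, not something CQ 5.6.1 is known to support out of the box: the class `AnalyzerByPath`, its method names, and the fallback to English are hypothetical. The analyzer class names are the ones shipped in Lucene's analyzers module; whether they are available on the classpath of a given CQ installation is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: picks a language (and an analyzer class name) from a repository path.
public class AnalyzerByPath {

    // Content subtree -> language code. The two roots are the paths from the example above.
    private static final Map<String, String> LANGUAGE_BY_ROOT = new LinkedHashMap<>();
    // Language code -> Lucene analyzer class name (assumed to be on the classpath).
    private static final Map<String, String> ANALYZER_BY_LANGUAGE = new LinkedHashMap<>();

    static {
        LANGUAGE_BY_ROOT.put("/chicco/site/it", "it");
        LANGUAGE_BY_ROOT.put("/content/site/en", "en");
        ANALYZER_BY_LANGUAGE.put("it", "org.apache.lucene.analysis.it.ItalianAnalyzer");
        ANALYZER_BY_LANGUAGE.put("en", "org.apache.lucene.analysis.en.EnglishAnalyzer");
    }

    // Returns the language of the longest configured root that contains the path,
    // or the fallback language when no root matches.
    public static String languageFor(String path, String fallback) {
        String bestRoot = null;
        for (String root : LANGUAGE_BY_ROOT.keySet()) {
            boolean matches = path.equals(root) || path.startsWith(root + "/");
            if (matches && (bestRoot == null || root.length() > bestRoot.length())) {
                bestRoot = root;
            }
        }
        return bestRoot == null ? fallback : LANGUAGE_BY_ROOT.get(bestRoot);
    }

    // Returns the analyzer class name for the path, falling back to English (an arbitrary choice).
    public static String analyzerClassFor(String path) {
        return ANALYZER_BY_LANGUAGE.get(languageFor(path, "en"));
    }

    public static void main(String[] args) {
        System.out.println(analyzerClassFor("/chicco/site/it/products"));
        System.out.println(analyzerClassFor("/content/site/en"));
        System.out.println(analyzerClassFor("/content/other"));
    }
}
```

The same lookup would have to be applied both when a node is indexed and when a query is executed, so that a term is stemmed the same way on both sides.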

 

Aldo