Our application supports search across multiple markets, covering both pages and DAM assets. Our basic query is below. Search works fine for most markets, but for the China market we are getting a lot of irrelevant results. I have not been able to find out how Lucene search works for non-English languages. Is any configuration required for this? Also, is there a specific Analyzer configuration for Chinese?
fulltext=insurance
1_group.p.or=true
1_group.1_group.path=/content/product/us
1_group.2_group.path=/content/dam/product/us
2_group.p.or=true
2_group.1_group.type=cq:Page
2_group.2_group.type=dam:Asset
p.excerpt=true
AEM version: 6.2.0, SP1 - CFP8
Oak version: Apache Jackrabbit Oak 1.4.17
Thanks,
Vazahat Fatima P
Dear Vazahat,
Normally, AEM indexes content for English. Lucene is also configured for English by default, and the out-of-the-box indexes follow English semantics.
AEM/Oak/Lucene/Java does not do any magic: it only crunches your data into numbers (hashes, i.e. an inverted index), compares the matches numerically, and returns them in a certain order. When you get irrelevant results, it means that your indexes contain irrelevant data. Therefore you need to correct:
a) How the data gets into your indexes
b) How you retrieve data from your indexes
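To make (a) and (b) concrete, here is a minimal sketch of a toy inverted index in Python. It is not how Oak or Lucene is implemented; it only illustrates how the tokenization chosen at index time and at query time changes what matches. The assumption (not confirmed for your setup) is that Han text ends up indexed one character at a time, which is compared with overlapping two-character tokens. The documents, ids, and function names are all made up for illustration.

```python
import re
from collections import defaultdict

# Runs of consecutive Han characters (basic CJK Unified Ideographs block only).
HAN_RUN = re.compile(r"[\u4e00-\u9fff]+")

def unigram_tokens(text):
    """One token per Han character (assumed behaviour of a non-CJK-aware setup)."""
    return [ch for run in HAN_RUN.findall(text) for ch in run]

def bigram_tokens(text):
    """Overlapping two-character tokens per Han run; a lone character stays as is."""
    tokens = []
    for run in HAN_RUN.findall(text):
        if len(run) == 1:
            tokens.append(run)
        else:
            tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
    return tokens

def build_index(docs, tokenize):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

def search(index, query, tokenize):
    """Return ids of documents that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    postings = [index.get(token, set()) for token in tokens]
    return set.intersection(*postings)

# Hypothetical sample documents.
docs = {
    "life": "我们提供人寿保险产品",   # "we offer life insurance products"
    "claims": "保险理赔流程说明",     # "insurance claims process"
    "risk": "保持风险意识",           # "stay aware of risk" (no word "insurance")
}

query = "保险"  # "insurance"

unigram_hits = search(build_index(docs, unigram_tokens), query, unigram_tokens)
bigram_hits = search(build_index(docs, bigram_tokens), query, bigram_tokens)

print(sorted(unigram_hits))
print(sorted(bigram_hits))
```

With single-character tokens the "risk" document also matches, because it contains both characters of the query in unrelated words. With two-character tokens only the two documents that actually contain the word for "insurance" are returned. The same tokenizer has to be used for indexing and for querying, which is why both (a) and (b) need attention.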
It is fairly hard to get this right with plain Oak-Lucene integration alone.[0]
Please consider using the Oak Solr extension[1], which provides support for Chinese and a human-readable configuration format.
I can also recommend the recent book on relevancy by Doug.[2]
[0] Issue with an oak index using synonym filter
[1] Language Analysis | Apache Solr Reference Guide 6.6
Regards,
Peter