Adobe Experience Manager Sites & More

10/15/15

Hi,
We have around half a million assets in our repository, and because of this the index is huge. I have tried various techniques to reduce the index size, such as purging workflow instances, audit entries, etc.
I have also tried almost everything by following every step of this doc: http://www.wemblog.com/2012/01/how-to-reduce-lucene-index-size-in-cq.html

The only thing I am missing is configuring a list of stop words, as it appears that by default CQ does full-text indexing of every word, which causes the large index size.
While digging deeper I found various stop word lists under the lucene-analyzers-3.6.0\org\apache\lucene\analysis structure.

However, I am not sure whether they are used or not. How can we configure the stop words setting for the SearchIndex in CQ 5.5/CQ 5.6?

NB: I am not using SOLR.

Thanks
Shishir Srivastava

Sham_HC · 10/15/15

AFAIK CQ 5.5 and above use com.day.crx.query.lucene.LuceneHandler for the SearchIndex by default, which does not have stopword filtering. Check your repository.xml and workspace.xml to see whether Lucene is configured with any custom handler apart from the OOB one. If so, check your implementation. If you are using the OOB configuration, what made you conclude that the index space is being used because of the stop words setting?
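
A quick way to do the check suggested above is to list the class of every SearchIndex element in repository.xml or workspace.xml. This is only a minimal sketch: it assumes the usual Jackrabbit layout where the handler is given as `<SearchIndex class="...">`, and the function names and the sample XML are made up for illustration, not taken from a real instance.

```python
import xml.etree.ElementTree as ET

# Assumption: the stock handler class named in the reply above.
OOB_HANDLER = "com.day.crx.query.lucene.LuceneHandler"


def search_index_handlers(xml_text):
    """Return the class attribute of every <SearchIndex> element found."""
    root = ET.fromstring(xml_text)
    return [el.get("class") for el in root.iter("SearchIndex")]


def custom_handlers(xml_text):
    """Return only the handler classes that differ from the OOB one."""
    return [c for c in search_index_handlers(xml_text) if c != OOB_HANDLER]


# Hypothetical, trimmed-down workspace.xml content for demonstration.
sample = """<Workspace name="crx.default">
  <SearchIndex class="com.day.crx.query.lucene.LuceneHandler">
    <param name="path" value="${wsp.home}/index"/>
  </SearchIndex>
</Workspace>"""

print(search_index_handlers(sample))
print(custom_handlers(sample))
```

To run it against a real file, read the file contents into a string and pass that in place of `sample`. An empty result from `custom_handlers` would suggest that only the OOB handler is configured.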

Yogesh_Upadhyay · 10/15/15

Hello Shishir,

I am not sure if you checked the Tika config file before using it. There was a syntax error in it. I just fixed it and attached it again. Also make sure that you disable the supportHighlighting feature. I have also attached an indexing_config file (with some more node types included) that you can use.

Yogesh
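
To confirm that highlighting is really off, something like the sketch below could be run against workspace.xml. It assumes the setting appears as a `<param name="supportHighlighting" value="..."/>` child of the SearchIndex element, and it treats a missing param as disabled, which I believe is the Jackrabbit default but is worth verifying for your version. The function name and sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET


def highlighting_enabled(xml_text):
    """Return one boolean per <SearchIndex>: True if supportHighlighting is on."""
    root = ET.fromstring(xml_text)
    results = []
    for index in root.iter("SearchIndex"):
        params = {p.get("name"): p.get("value") for p in index.findall("param")}
        # Assumption: an absent param means highlighting is not enabled.
        value = params.get("supportHighlighting", "false")
        results.append(value.strip().lower() == "true")
    return results


# Hypothetical workspace.xml fragment with highlighting still enabled.
sample_on = """<Workspace name="crx.default">
  <SearchIndex class="com.day.crx.query.lucene.LuceneHandler">
    <param name="supportHighlighting" value="true"/>
  </SearchIndex>
</Workspace>"""

print(highlighting_enabled(sample_on))
```

If any entry comes back True, that SearchIndex still has highlighting enabled. Changing the param usually only takes effect for newly indexed content, so a re-index is likely needed before the index size reflects it.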

how to configure stop words in searchindex to reduce index size