Hello
I'm implementing a search feature in CQ5.5. A requirement for search is that 'stemming' be used - search for 'builder' and find 'building', 'builds' etc.
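To make the requirement concrete, here is a toy illustration of what stemming buys you at query time. It is only a sketch: the class `ToyStemmer`, its suffix list, and the minimum stem length are all made up for this example, and the naive suffix stripping is not the Porter algorithm and not what Lucene's analyzers actually do.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical toy stemmer: strips one common English suffix so that
// related word forms collapse to the same term.
final class ToyStemmer {
    // Suffixes are tried in this order; the first match wins.
    private static final List<String> SUFFIXES = Arrays.asList("ing", "er", "ed", "s");
    // Never strip a suffix if fewer than this many characters would remain.
    private static final int MIN_STEM_LENGTH = 3;

    static String stem(String word) {
        String lower = word.toLowerCase(Locale.ROOT);
        for (String suffix : SUFFIXES) {
            int stemLength = lower.length() - suffix.length();
            if (lower.endsWith(suffix) && stemLength >= MIN_STEM_LENGTH) {
                return lower.substring(0, stemLength);
            }
        }
        return lower;
    }

    // A query matches an indexed term when both reduce to the same stem.
    static boolean matches(String query, String indexedTerm) {
        return stem(query).equals(stem(indexedTerm));
    }

    public static void main(String[] args) {
        for (String term : Arrays.asList("building", "builds", "bridge")) {
            System.out.println("builder ~ " + term + ": " + matches("builder", term));
        }
    }
}
```

A real analyzer applies the same idea at both index time and query time, which is why the analyzer has to be configured on the index itself rather than only in the search code.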
This functionality can be had using, for example, org.apache.lucene.analysis.en.EnglishAnalyzer. However, the Lucene that ships with CQ5.5 is version 3.0.3. Its available analyzers are:
$ find . -name "lucene-core*.jar" | xargs jar tf | grep "analysis/.*Analyzer"
org/apache/lucene/analysis/Analyzer.class
org/apache/lucene/analysis/KeywordAnalyzer.class
org/apache/lucene/analysis/PerFieldAnalyzerWrapper.class
org/apache/lucene/analysis/SimpleAnalyzer.class
org/apache/lucene/analysis/StopAnalyzer$1.class
org/apache/lucene/analysis/StopAnalyzer$SavedStreams.class
org/apache/lucene/analysis/StopAnalyzer.class
org/apache/lucene/analysis/WhitespaceAnalyzer.class
org/apache/lucene/analysis/standard/StandardAnalyzer$1.class
org/apache/lucene/analysis/standard/StandardAnalyzer$SavedStreams.class
org/apache/lucene/analysis/standard/StandardAnalyzer.class
None of these support stemming. The earliest version where EnglishAnalyzer is available is 3.2. Is there a way to update the Lucene on an existing installation?
Alternatively, in later versions of Lucene, the analyzers live in a jar of their own, analyzers-common (http://lucene.apache.org/core/4_0_0/analyzers-common/overview-summary.html).
How would I expose these analyzers to the org.apache.jackrabbit.core.query.lucene.SearchIndex, the class that reads the SearchIndex tag in workspace.xml?
Thanks,
Eli
You can write your own EnglishAnalyzer and then provide it through a fragment bundle for the embedded repository bundle (com.day.crx.sling.server). See the sample for the Lucene excerpt provider at [1]; you can implement this along similar lines.
[1] http://aemfaq.blogspot.com/2013/09/how-to-override-lucene-excerpt-provider.html
You can create an OSGi fragment bundle that contains the updated Java classes providing the functionality you want to use. One of the most powerful aspects of AEM is that when it lacks out-of-the-box functionality you need, you can create your own OSGi bundles containing the Java code you need, and write custom front-end components that call the back-end service.
Thanks for the reply! So, suppose I wrap the lucene-core jar with OSGi metadata. This fragment would then, supposedly, export org.apache.lucene.analysis.*. Will org.apache.jackrabbit.core.query.lucene.SearchIndex become aware of the new analyzers?
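Whether the new analyzers are picked up comes down to class visibility. As a rough sketch, and assuming (I have not confirmed this against the Jackrabbit source) that SearchIndex resolves the configured analyzer by class name through reflection, the lookup would behave something like the hypothetical `AnalyzerLookup` helper below: the class is found only if the class loader doing the lookup can see it. A fragment bundle attaches to its host's class loader, which is why it is suggested here instead of a separate bundle.

```java
// Hypothetical helper illustrating reflective lookup by class name.
// It is not Jackrabbit code; names and behavior are assumptions for illustration.
final class AnalyzerLookup {

    // Returns a new instance of the named class, or null if the given
    // class loader cannot see it or it has no usable no-arg constructor.
    static Object instantiate(String className, ClassLoader loader) {
        try {
            Class<?> type = Class.forName(className, true, loader);
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = AnalyzerLookup.class.getClassLoader();
        // Visible to this class loader, so an instance is created.
        System.out.println(instantiate("java.util.ArrayList", loader) != null);
        // Made-up class name that is not on the class path, so the lookup fails.
        System.out.println(instantiate("com.example.MissingAnalyzer", loader) != null);
    }
}
```

If the lookup works this way, exporting org.apache.lucene.analysis.* from an unrelated bundle would not be enough on its own; the classes need to be visible to the repository bundle's class loader.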
Hi,
I am trying to add an analyzer from lucene-analyzer-2.4.1.jar to the search-index tag of workspace.xml. As this jar is not available in the out-of-the-box CQ 5.4 version, I can add it as a dependency to my custom OSGi bundle.
But is it possible for CQ to find this jar while indexing?