Level 2
October 16, 2015
Solved

Update Lucene analyzers in CQ5.5


Hello

I'm implementing a search feature in CQ5.5. A requirement for search is that 'stemming' be used - search for 'builder' and find 'building', 'builds' etc.
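
To make the requirement concrete, here is a minimal, self-contained sketch of what stemming does to a query term and an indexed term. It is a naive suffix stripper written only for illustration: it is not the Porter algorithm and not what any Lucene analyzer actually runs, and the class name, suffix list, and minimum stem length are all assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class; not part of Lucene, Jackrabbit, or CQ.
class NaiveStemDemo {
    // Assumed suffix list, checked in order so that longer suffixes win.
    private static final List<String> SUFFIXES =
            Arrays.asList("ing", "ers", "er", "ed", "s");

    // Lower-cases the word and strips the first matching suffix,
    // keeping at least three characters so short words stay intact.
    static String stem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        for (String suffix : SUFFIXES) {
            if (w.endsWith(suffix) && w.length() - suffix.length() >= 3) {
                return w.substring(0, w.length() - suffix.length());
            }
        }
        return w;
    }

    // A query matches an indexed term when both reduce to the same stem.
    static boolean matches(String query, String indexedTerm) {
        return stem(query).equals(stem(indexedTerm));
    }

    public static void main(String[] args) {
        for (String term : Arrays.asList("building", "builds", "bridge")) {
            System.out.println("builder ~ " + term + ": " + matches("builder", term));
        }
    }
}
```

A stemming analyzer applies the same idea at both index time and query time, so both sides must use the same analyzer for the stems to line up.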

This functionality is available in, for example, org.apache.lucene.analysis.en.EnglishAnalyzer. However, the Lucene that ships with CQ5.5 is version 3.0.3 (v303). Its available analyzers are:

$  find . -name "lucene-core*.jar" | xargs jar tf | grep "analysis/.*Analyzer"
org/apache/lucene/analysis/Analyzer.class
org/apache/lucene/analysis/KeywordAnalyzer.class
org/apache/lucene/analysis/PerFieldAnalyzerWrapper.class
org/apache/lucene/analysis/SimpleAnalyzer.class
org/apache/lucene/analysis/StopAnalyzer$1.class
org/apache/lucene/analysis/StopAnalyzer$SavedStreams.class
org/apache/lucene/analysis/StopAnalyzer.class
org/apache/lucene/analysis/WhitespaceAnalyzer.class
org/apache/lucene/analysis/standard/StandardAnalyzer$1.class
org/apache/lucene/analysis/standard/StandardAnalyzer$SavedStreams.class
org/apache/lucene/analysis/standard/StandardAnalyzer.class

None of these support stemming. The earliest version where EnglishAnalyzer is available is 3.2. Is there a way to update Lucene on an existing installation?

Alternatively, in later versions of Lucene, the analyzers live in a jar of their own, analyzers-common (http://lucene.apache.org/core/4_0_0/analyzers-common/overview-summary.html).

How would I expose these analyzers to the org.apache.jackrabbit.core.query.lucene.SearchIndex, the class that reads the SearchIndex tag in workspace.xml?

 

Thanks,

Eli

4 replies

smacdonald2008
Level 10
October 16, 2015

You can create a Java OSGi fragment bundle that contains the updated Java classes with the functionality you want to use. One of the most powerful aspects of AEM is that if it lacks out-of-the-box functionality you need, you can create your own OSGi bundles containing the Java code you need, and write custom front-end components that call the back-end service.

Level 2
October 16, 2015

Thanks for the reply! So, suppose I wrap the lucene-core jar with OSGi metadata. This fragment would then, supposedly, export org.apache.lucene.analysis.*. Will org.apache.jackrabbit.core.query.lucene.SearchIndex become aware of the new analyzers?

Sham_HC
Accepted solution
Level 10
October 16, 2015
You can write your own EnglishAnalyzer and provide it through a fragment bundle attached to the embedded repository bundle (com.day.crx.sling.server). See the sample for the Lucene excerpt provider at [1]; you can implement this along similar lines.

[1]  http://aemfaq.blogspot.com/2013/09/how-to-override-lucene-excerpt-provider.html

October 16, 2015

Hi,

I am trying to add an analyzer from lucene-analyzer-2.4.1.jar to the search-index tag of workspace.xml. As this jar is not available out of the box in CQ 5.4, I can add it as a dependency of my custom OSGi bundle.

But will CQ be able to find this jar while indexing?