AEM 6.2 - Oak Lucene Indexes - Configuring Composing Analyzer




I'm looking for a way to configure a composed analyzer so that the analyzer used at index time and the one used at query time apply different filters.

In the Oak Lucene documentation (Jackrabbit Oak – Lucene Index) I see that we can configure the default or pathText analyzers, and maybe others given the "...", but I can't find documentation with an exhaustive list of the analyzers we can use.

Oak Lucene documentation

+ sampleIndex
    - jcr:primaryType = "oak:QueryIndexDefinition"
    + analyzers
        + default
        + pathText

They also seem to be different from the ones in the Apache Solr documentation (Analyzers - Apache Solr Reference Guide - Apache Software Foundation), which has separate index and query analyzers:

<analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeepWordFilterFactory" words="keepwords.txt"/>
    <filter class="solr.SynonymFilterFactory" synonyms="syns.txt"/>
</analyzer>

<analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
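To make clear what I mean by different chains at index and query time, here is a minimal plain-Java sketch of the two Solr chains above. It does not use Solr or Lucene: the class, the method names, and the sample keep-word and synonym sets (stand-ins for keepwords.txt and syns.txt) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Plain-Java stand-in for the two chains; none of these names are Solr or Lucene APIs.
class IndexVsQueryAnalysisSketch {

    // Hypothetical stand-ins for the contents of keepwords.txt and syns.txt.
    static final Set<String> KEEP_WORDS = Set.of("car", "bike");
    static final Map<String, List<String>> SYNONYMS = Map.of("car", List.of("automobile"));

    // Shared steps: split on anything that is not a letter or digit
    // (a crude stand-in for the standard tokenizer), then lowercase.
    static List<String> tokenizeAndLowerCase(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            if (!raw.isEmpty()) {
                tokens.add(raw.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Index-time chain: tokenize, lowercase, keep only listed words, then add synonyms.
    static List<String> analyzeForIndex(String text) {
        List<String> out = new ArrayList<>();
        for (String token : tokenizeAndLowerCase(text)) {
            if (KEEP_WORDS.contains(token)) {
                out.add(token);
                out.addAll(SYNONYMS.getOrDefault(token, List.of()));
            }
        }
        return out;
    }

    // Query-time chain: tokenize and lowercase only.
    static List<String> analyzeForQuery(String text) {
        return tokenizeAndLowerCase(text);
    }

    public static void main(String[] args) {
        System.out.println(analyzeForIndex("The Car and the Bike"));
        System.out.println(analyzeForQuery("Automobile"));
    }
}
```

With these sample sets, the index chain stores automobile alongside car, so a query for Automobile, which is only tokenized and lowercased, can still match. That asymmetry is what I would like to reproduce in Oak.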

Does anyone know where I could find documentation listing the configurable analyzers for an Oak Lucene index, or how I can configure different analyzers for query time and index time?

Best regards,

Maxime Nougarede


Accepted Solutions (0)

Answers (4)





Dear Maxime,

Thank you for asking such an interesting question.

Looking at the Oak source code, we can see that the analyzers are assembled in:

private static Map<String, Analyzer> collectAnalyzers(NodeState defn) {
    Map<String, Analyzer> analyzerMap = newHashMap();
    NodeStateAnalyzerFactory factory = new NodeStateAnalyzerFactory(LuceneIndexConstants.VERSION);
    NodeState analyzersTree = defn.getChildNode(LuceneIndexConstants.ANALYZERS);

    // Build one Analyzer per child node of 'analyzers', keyed by the node name.
    for (ChildNodeEntry cne : analyzersTree.getChildNodeEntries()) {
        Analyzer a = factory.createInstance(cne.getNodeState());
        analyzerMap.put(cne.getName(), a);
    }

    // If the original term should be indexed and no 'default' node exists, register an OakAnalyzer as 'default'.
    if (getOptionalValue(analyzersTree, INDEX_ORIGINAL_TERM, false) && !analyzerMap.containsKey(ANL_DEFAULT)) {
        analyzerMap.put(ANL_DEFAULT, new OakAnalyzer(VERSION, true));
    }

    return ImmutableMap.copyOf(analyzerMap);
}

The resulting map is then assigned to a field when the IndexDefinition is built:

this.analyzers = collectAnalyzers(defn);

That field is then read in the following place:

if (analyzers.containsKey(LuceneIndexConstants.ANL_DEFAULT)) {
    defaultAnalyzer = analyzers.get(LuceneIndexConstants.ANL_DEFAULT);
}

So, to answer your question: you can define as many analyzers as you want, but as per the current code base only the 'default' analyzer will be used.
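The effect of that lookup can be sketched in plain Java. This is only a simplified model, not Oak's code: strings stand in for Analyzer instances, the class and method names are hypothetical, and the "built-in" fallback is an assumption about what happens when no 'default' node is configured.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model of the lookup; strings stand in for Analyzer instances. Not Oak's classes.
class DefaultAnalyzerLookupSketch {

    static final String ANL_DEFAULT = "default";

    // Assumption: placeholder for whatever analyzer applies when no 'default' node is configured.
    static final String BUILT_IN_FALLBACK = "built-in";

    // Mirrors the containsKey/get step: only the 'default' entry is ever consulted.
    static String pickAnalyzer(Map<String, String> configured) {
        String chosen = BUILT_IN_FALLBACK;
        if (configured.containsKey(ANL_DEFAULT)) {
            chosen = configured.get(ANL_DEFAULT);
        }
        return chosen;
    }

    public static void main(String[] args) {
        Map<String, String> configured = new LinkedHashMap<>();
        configured.put("default", "my-default-chain");
        // Hypothetical extra node: it is collected into the map but never read.
        configured.put("query", "my-query-chain");
        System.out.println(pickAnalyzer(configured));
    }
}
```

In this model, adding a node named anything other than 'default' changes nothing about which analyzer is picked, which matches what the excerpts above suggest.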






Thanks a lot for your help. Meanwhile, I found part of my answer in the Jackrabbit Oak – Lucene Index documentation, where they wrote:

"Note that currently only one analyzer can be configured per index. Its not possible to specify separate analyzer for query and index time currently."

But I would still like to understand the difference between default and pathText.