SOLVED

Issue with an Oak index using a synonym filter


Level 2

Hey guys,

I have a custom index that works fine as long as I do not add any analyzers to it. I added an analyzer with a synonym filter so that searches for "fact sheets" and "factsheets" return the same results. The index stopped working after that.

Any help is appreciated.

Thank you!

Here is the index:

<?xml version="1.0" encoding="UTF-8"?>
<jcr:root xmlns:oak="http://jackrabbit.apache.org/oak/ns/1.0" xmlns:cq="http://www.day.com/jcr/cq/1.0" xmlns:jcr="http://www.jcp.org/jcr/1.0" xmlns:nt="http://www.jcp.org/jcr/nt/1.0"
          jcr:primaryType="oak:Unstructured"
          async="async"
          compatVersion="{Long}2"
          evaluatePathRestrictions="{Boolean}true"
          reindex="{Boolean}true"
          type="lucene">
    <indexRules jcr:primaryType="nt:unstructured">
        <nt:base
            jcr:primaryType="nt:unstructured"
            includePropertyTypes="all">
            <properties jcr:primaryType="nt:unstructured">
                <literatureTitle name="literatureTitle"
                    analyzed="{Boolean}true"
                    ordered="{Boolean}true"
                    jcr:primaryType="nt:unstructured"/>
                <displayContentTypename name="displayContentTypename"
                    analyzed="{Boolean}true"
                    jcr:primaryType="nt:unstructured"/>
            </properties>
        </nt:base>
    </indexRules>
    <analyzers jcr:primaryType="nt:unstructured">
        <default jcr:primaryType="nt:unstructured">
            <filters jcr:primaryType="nt:unstructured">
                <LowerCase jcr:primaryType="nt:unstructured"/>
                <Synonym jcr:primaryType="nt:unstructured"
                    synonyms="synonym.txt">
                    <synonym.txt/>
                </Synonym>
            </filters>
        </default>
    </analyzers>
</jcr:root>

And in synonym.txt I have:

fact sheets, factsheets


14 Replies


Level 10

Have you read somewhere that this is supported? I am checking internally.


Level 2

I created it using this as an example: AEM Search Indexing: Synonyms, Filters, and Stop Words (oh my!) | HS2 Solutions

Is there any way I can achieve this, so that "factsheet" and "fact sheets" return the same results?


Level 10

Thanks for the information. In your example, have you followed everything described here: Understanding Analyzers, Tokenizers, and Filters | Apache Solr Reference Guide 6.6?


Level 2

We are not using a Solr index, just the Lucene full-text index. The index is in the same format as the example, but I am not sure why it's not working.

Thanks!


Employee Advisor

What do you mean by "it stopped working"? Any exceptions? What change in behaviour did you see when you changed the index definition?


Level 2

When I try to reindex, the reindex flag just stays true and doesn't change to false, and I don't see anything in the logs either.


Employee Advisor

OK, so if you add these additional settings and try to reindex, the reindexing does not start? Just from looking at the index definition, I would assume that the node type of the index definition itself (/oak:index/customIndex) is wrong; it should not be "oak:Unstructured" but rather "oak:QueryIndexDefinition".

Jörg


Level 2

If I use oak:QueryIndexDefinition, it gives me a "javax.jcr.nodetype.ConstraintViolationException: OakConstraint0001: The primary type null does not exist (500)" error. I am not sure what the issue is.


Level 2

Never mind, I had to delete the existing index for it to take the new node type. I will try adding the synonym now and let you know whether it works.

Thanks!
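For readers following along, here is a minimal sketch of the index root with the node type changed. The property values are copied from the original post; only jcr:primaryType differs, spelled as it appears in the JSON dump later in the thread. It is an illustration, not the poster's actual file, and the child nodes are elided.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: same root properties as the original post, with the primary type changed. -->
<jcr:root xmlns:oak="http://jackrabbit.apache.org/oak/ns/1.0" xmlns:jcr="http://www.jcp.org/jcr/1.0" xmlns:nt="http://www.jcp.org/jcr/nt/1.0"
          jcr:primaryType="oak:QueryIndexDefinition"
          async="async"
          compatVersion="{Long}2"
          evaluatePathRestrictions="{Boolean}true"
          reindex="{Boolean}true"
          type="lucene">
    <!-- indexRules and analyzers child nodes go here, unchanged from the original post -->
</jcr:root>
```

As noted in the post above, an existing node of a different primary type may have to be deleted before the new definition is accepted.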


Level 2

After I changed the node type, I had to add a tokenizer for the index to be reindexed.

<analyzers jcr:primaryType="nt:unstructured">
    <default jcr:primaryType="nt:unstructured">
        <filters jcr:primaryType="nt:unstructured">
            <Synonym jcr:primaryType="nt:unstructured"
                synonyms="synonym.txt">
                <synonym.txt/>
            </Synonym>
        </filters>
        <tokenizer jcr:primaryType="nt:unstructured"
            name="classic"/>
    </default>
</analyzers>

Now I can reindex, but I am not getting any results back using this index.


Employee Advisor

Can you please provide your complete index definition (e.g. as JSON dump)? It's hard to guess just from this snippet what could be wrong.


Level 2

Here is the JSON of the index:

"faiLiteratureIndex":{
   "jcr:primaryType":"oak:QueryIndexDefinition",
   "compatVersion":2,
   "type":"lucene",
   "async":"async",
   "evaluatePathRestrictions":true,
   "reindex":true,
   "indexRules":{
      "jcr:primaryType":"nt:unstructured",
      "nt:base":{
         "jcr:primaryType":"nt:unstructured",
         "includePropertyTypes":"all",
         "properties":{
            "jcr:primaryType":"nt:unstructured",
            "literatureTitle":{
               "jcr:primaryType":"nt:unstructured",
               "ordered":true,
               "analyzed":true,
               "name":"literatureTitle"
            },
            "displayContentTypename":{
               "jcr:primaryType":"nt:unstructured",
               "analyzed":true,
               "name":"displayContentTypename"
            }
         }
      }
   },
   "analyzers":{
      "jcr:primaryType":"nt:unstructured",
      "default":{
         "jcr:primaryType":"nt:unstructured",
         "filters":{
            "jcr:primaryType":"nt:unstructured",
            "Synonym":{
               "jcr:primaryType":"nt:unstructured",
               "synonyms":"synonym.txt"
            }
         },
         "tokenizer":{
            "jcr:primaryType":"nt:unstructured",
            "name":"classic"
         }
      }
   }
}


Correct answer by
Employee Advisor

As I read the docs [1], the synonym.txt file should be an nt:file child node of the "Synonym" node. Also, is there a "ClassicTokenizerFactory"? I would try "name": "Classic" (capitalized).

Jörg

[1] Jackrabbit Oak – Lucene Index


Level 2

I am able to get results back now. I also had to add a lowercase filter for it to work. Thank you so much, Jörg Hoh.
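The final definition was not posted, so the following is only a sketch of what the analyzers node might look like with the pieces mentioned in this thread combined: a tokenizer, a lowercase filter, and the synonym filter with its file as a child node. The capitalized tokenizer name follows the accepted answer, and the filter order follows the original post; both are assumptions about the working setup.

```xml
<!-- Sketch only. In a real .content.xml the xmlns declarations live on jcr:root;
     xmlns:jcr is repeated here so the fragment is well-formed on its own. -->
<analyzers xmlns:jcr="http://www.jcp.org/jcr/1.0" jcr:primaryType="nt:unstructured">
    <default jcr:primaryType="nt:unstructured">
        <!-- Tokenizer name capitalized as suggested in the accepted answer (assumption) -->
        <tokenizer jcr:primaryType="nt:unstructured"
            name="Classic"/>
        <filters jcr:primaryType="nt:unstructured">
            <!-- Lowercase filter, which the poster reports was also needed -->
            <LowerCase jcr:primaryType="nt:unstructured"/>
            <Synonym jcr:primaryType="nt:unstructured"
                synonyms="synonym.txt">
                <!-- Placeholder for the synonym.txt child node; assumed to be deployed as an nt:file -->
                <synonym.txt/>
            </Synonym>
        </filters>
    </default>
</analyzers>
```

The synonym.txt file itself would contain the line from the original post: fact sheets, factsheets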