SOLVED

Query optimization using custom index

Hi,
I am running a custom query (see below) to fetch around 2 million records (from the past 10 years) from DAM, but because of the extensive node traversal it reaches the threshold limit and throws an exception. I tried creating a custom Lucene index to optimize the query, but no luck; I still get the same exception. Any pointers on how to achieve this would be helpful, especially from anyone who has implemented it before.
Query in XPath:

/jcr:root/content/dam/abc/product//element(*, dam:Asset)
[
((jcr:content/@jcr:lastModified > xs:dateTime('2010-03-14T00:00:00.000Z')
and jcr:content/@jcr:lastModified < xs:dateTime('2022-06-23T00:00:00.000Z')))
]

 

@arunpatidar26 @shawn 

1 Accepted Solution

Correct answer by
Employee

Hi @suyog,

Can you try creating the custom index with the definition below:

- compatVersion = 2
- async = "async"
- jcr:primaryType = oak:QueryIndexDefinition
- evaluatePathRestrictions = true
- type = "lucene"
+ indexRules
  + dam:Asset
    + properties
      + lastModified
        - name = "jcr:content/jcr:lastModified"
        - propertyIndex = true

 

You can also tune the node traversal threshold through the OSGi QueryEngineSettings configuration.

More information is available at: https://jackrabbit.apache.org/oak/docs/query/query-engine.html#Slow_Queries_and_Read_Limits
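
If raising the read limit is not an option, one possible workaround (not something suggested in this thread, just a sketch) is to run the same query over smaller date windows so that each execution returns fewer nodes. The Java snippet below only builds the XPath strings; the class name `DateWindowQueries`, the method `buildQueries`, and the 12-month window size are hypothetical choices for illustration, and whether a given window stays under the limit depends on how the assets are distributed over time.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits one large date range into smaller query windows.
class DateWindowQueries {

    // Same timestamp shape as in the original query, e.g. 2010-03-14T00:00:00.000Z.
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");

    // Path and node type copied from the question; adjust them for your repository.
    private static final String TEMPLATE =
            "/jcr:root/content/dam/abc/product//element(*, dam:Asset)"
            + "[jcr:content/@jcr:lastModified >= xs:dateTime('%s')"
            + " and jcr:content/@jcr:lastModified < xs:dateTime('%s')]";

    // Returns one XPath query per window of at most `months` months covering [start, end).
    static List<String> buildQueries(LocalDate start, LocalDate end, int months) {
        if (months < 1) {
            throw new IllegalArgumentException("months must be >= 1");
        }
        List<String> queries = new ArrayList<>();
        LocalDate from = start;
        while (from.isBefore(end)) {
            LocalDate to = from.plusMonths(months);
            if (to.isAfter(end)) {
                to = end; // clamp the last window to the requested end date
            }
            queries.add(String.format(TEMPLATE,
                    FMT.format(from.atStartOfDay()),
                    FMT.format(to.atStartOfDay())));
            from = to; // the next window starts where this one ended
        }
        return queries;
    }

    public static void main(String[] args) {
        // Date range taken from the question, split into 12-month windows.
        for (String q : buildQueries(LocalDate.of(2010, 3, 14), LocalDate.of(2022, 6, 23), 12)) {
            System.out.println(q);
        }
    }
}
```

Each generated string can then be executed separately through the JCR query API. Note that the sketch uses `>=` on the lower bound (the original query uses `>`) so that assets modified exactly on a window boundary are not skipped between adjacent windows.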

Thanks!!

 


2 Replies

Hi, 

We tried to implement the solution given above but still get the traversal error: the query reaches the threshold of 100000 nodes and is aborted. Can you please suggest anything for this?

Thanks!!