AEM 6.0/oak-1.0.22 indexing issue

Level 7

Hi all,

 

We have an indexing issue on AEM 6.0/oak-1.0.22. There are a lot of warning messages like:

org.apache.jackrabbit.oak.plugins.index.property.strategy.ContentMirrorStoreStrategy Traversed 80000 nodes (280221 index entries) using index cq:tags with filter Filter(query=select [jcr:path], [jcr:score], * from [cq:PageContent] as a where [sling:resourceType] = 'site/components/pages/bio' and [cq:tags] = 'departments/film-tvs' and [firstName] <> '' and isdescendantnode(a, '/content/site/directory') order by [lastName], [firstName] /* xpath: /jcr:root/content/site/directory//element(*, cq:PageContent)[@sling:resourceType='site/components/pages/bio' and @cq:tags='departments/film-tvs' and @firstName <> '' ] order by @lastName, @firstName */ ...
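When a log contains many of these warnings, it can help to pull out the counts and the index name for comparison. The following is a minimal sketch, not an Oak tool: the regular expression is inferred only from the single warning line quoted above, and the function name `parse_traversal_warning` is made up for illustration.

```python
import re

# Pattern assumed from the one warning line quoted in this thread;
# other Oak versions may format the message differently.
WARNING_RE = re.compile(
    r"Traversed (?P<nodes>\d+) nodes \((?P<entries>\d+) index entries\) "
    r"using index (?P<index>\S+) with filter"
)


def parse_traversal_warning(line):
    """Return (nodes, entries, index_name), or None if the line does not match."""
    match = WARNING_RE.search(line)
    if match is None:
        return None
    return int(match["nodes"]), int(match["entries"]), match["index"]


sample = (
    "org.apache.jackrabbit.oak.plugins.index.property.strategy."
    "ContentMirrorStoreStrategy Traversed 80000 nodes (280221 index entries) "
    "using index cq:tags with filter Filter(query=select ...)"
)
print(parse_traversal_warning(sample))  # (80000, 280221, 'cq:tags')
```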

 

The log doesn't say that an index is missing, but the index doesn't seem to be effective.

Screen Shot 2022-12-22 at 9.56.24 AM.png

 

The query uses the OOTB cqTags index, as shown. When I ran the query in Tools > Query console, I saw a lot of warnings, so I created my own index:

 

Screen Shot 2022-12-22 at 9.55.39 AM.png

 

I still got a lot of warnings, so I deleted the OOTB cqTags index. Now I don't see the warnings in the log. Either way, though, the AEM service consumes a lot of resources and can bring down the server within a short period of time.

Also, why does AEM have to traverse 80000 nodes even though the query path is "/jcr:root/content/site/directory//element(*, cq:PageContent)"?

 

Any help will be appreciated!

 

-kt

 

10 Replies

Community Advisor

Hi @kevingtan 

 

Can you check whether the content path the query runs against is large? Generally, this happens when you give a generic path (like /content) that has a huge structure below it.

 

Similar issue: https://experienceleaguecommunities.adobe.com/t5/adobe-experience-manager/the-query-read-or-traverse...

 

Hope this helps!

 

Thanks,

Kiran Vedantam

Level 7

Hi Kiran,

 

No, the path is very specific. I actually ran a very simple query:

/jcr:root/content/site/departments/film-tvs/courses/2022-fall//element(*, cq:PageContent)

 

It traverses 80000 nodes even though there are fewer than 100 pages under that directory. The similar issue you are referring to is for a higher version. AEM 6.0 has limited tools for explaining queries.

Community Advisor

@kevingtan 

 

The query targets the node type cq:PageContent, so if the indexes are adequate, it should ideally pick an index specific to cq:PageContent. Adding cq:tags to the cq:PageContent index therefore might not help.

 

I would suggest the following steps:

  • Remove the cq:tags part from the query and then execute it. Check whether the index specific to "cq:PageContent" is picked.
    • If yes, the cq:tags index seems to have a lower cost than the cq:PageContent index in this case.
    • If no, you might have to improve the "cq:PageContent" index.
  • If the query is always going to run under a specific content path, consider creating a smaller index. Use the evaluatePathRestrictions, excludedPaths, includedPaths, and queryPaths properties to fine-tune the indexes.
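To make the last suggestion more concrete, here is a rough sketch of what a path-scoped Lucene index definition could look like, written as a Python dict that mirrors the node structure. The property names follow the Oak Lucene index documentation; whether oak-1.0.22 honors each of them is an assumption that would need checking. The helper `build_index_definition` and the child node labels (`cq_tags` and so on) are made up for illustration.

```python
import json


def build_index_definition(included_path, property_names):
    """Build a dict mirroring a hypothetical Lucene index definition node."""
    properties = {
        # The child node label is arbitrary; "name" holds the JCR property name.
        name.replace(":", "_"): {"name": name, "propertyIndex": True}
        for name in property_names
    }
    return {
        "jcr:primaryType": "oak:QueryIndexDefinition",
        "type": "lucene",
        "async": "async",
        # Let the index evaluate the path restriction of the query itself.
        "evaluatePathRestrictions": True,
        # Only index content below this path (keeps the index small).
        "includedPaths": [included_path],
        # Only offer this index to queries restricted to this path.
        "queryPaths": [included_path],
        "indexRules": {"cq:PageContent": {"properties": properties}},
    }


definition = build_index_definition(
    "/content/site/directory",
    ["sling:resourceType", "cq:tags", "firstName", "lastName"],
)
print(json.dumps(definition, indent=2))
```

The path and property list are taken from the query in the original post. The dict is only a way to show the shape; the real definition would be created as nodes under /oak:index and would need a reindex.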

 

https://jackrabbit.apache.org/oak/docs/query/query-engine.html


Aanchal Sikka

Level 7

Hi @aanchal-sikka,

 

It looks like Oak doesn't know how to limit the query scope for some reason. My understanding is that if a query is restricted to a certain path, say /jcr:root/a/b/c, the number of nodes it reads shouldn't exceed the total number of nodes under that path. When I issue a query with a "@cq:tags" condition against a path with fewer than 10 nodes underneath, it gives the warning "Traversed 10000 nodes...". Without the condition there is no warning, so Oak somehow has a problem with that index. The query has no problems with other index keys, such as start_date, searchScope, etc., probably because cq:tags conflicts with the OOTB oak:index/cqTags.

Employee Advisor

Please, please, please update to a more recent version. AEM 6.0 has been out of support for quite some time, and AEM 6.5 has fixed many of the issues you are facing with Oak 1.0.x.

 

(And even if you cannot update immediately: There are much newer Oak versions available for it. I know of at least Oak 1.0.40.)

Level 7

Yes, we are in the process of updating, but something needs an immediate fix.

Employee Advisor

The message above is just a warning. Once the query reaches 100k nodes, it will be terminated. You can increase that limit (see https://jackrabbit.apache.org/oak/docs/query/query-engine.html#slow-queries-and-read-limits), but I would recommend sticking with the default.
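The warn-then-terminate behavior described here can be pictured with a toy model. This is a sketch only, not Oak code: the 100k limit comes from the post above, while the warning interval of 10000 reads, the `ReadLimitExceeded` exception, and the `run_query` function are assumptions made for illustration.

```python
class ReadLimitExceeded(Exception):
    """Hypothetical stand-in for the error raised when a query reads too much."""


def run_query(entries, predicate, read_limit=100_000, warn_every=10_000):
    """Scan entries, noting a warning every warn_every reads and aborting past read_limit."""
    matches, warnings = [], []
    for read_count, entry in enumerate(entries, start=1):
        if read_count > read_limit:
            # The whole query is aborted; matches found so far are not returned.
            raise ReadLimitExceeded(f"read more than {read_limit} nodes")
        if read_count % warn_every == 0:
            warnings.append(f"Traversed {read_count} nodes")
        if predicate(entry):
            matches.append(entry)
    return matches, warnings


# A scan of 25000 entries finishes, but leaves two warnings behind.
matches, warnings = run_query(range(25_000), lambda n: n % 10_000 == 0)
print(matches)   # [0, 10000, 20000]
print(warnings)  # ['Traversed 10000 nodes', 'Traversed 20000 nodes']
```

In this model, raising the limit only postpones the abort; the cost of the scan itself stays the same, which is why a narrower index is the more durable fix.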

Level 7

That's correct, but it will consume a lot of CPU once the site gets enough traffic. Plus, when the query is dropped at the limit, say 100k nodes, the result set is not always accurate, because the more relevant records may not have been reached yet.

Employee Advisor

Yes, that's the drawback of the situation. Your index has approx. 280k entries, so the query might need to go through all of them. The best way would be a custom index, but I know that in later versions a lot of indexes have been adjusted to avoid issues like this, so I am not sure you want to introduce a custom index right now. Also, your old Oak version probably lacks many of the features required to make the recommendations for newer AEM versions work.
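One way to picture why a narrow path does not help here: if an index is keyed by property value only, every entry for that value is visited and the path is checked afterwards. The toy model below illustrates that idea under exactly this assumption. It is not how ContentMirrorStoreStrategy is implemented, and all names and the content in it are made up.

```python
from collections import defaultdict


def build_value_index(nodes, property_name):
    """Map each property value to the paths of all nodes carrying it."""
    index = defaultdict(list)
    for path, props in nodes.items():
        for value in props.get(property_name, []):
            index[value].append(path)
    return index


def query(index, value, path_prefix):
    """Return (matching paths, entries read) for value restricted to path_prefix."""
    entries_read = 0
    matches = []
    for path in index.get(value, []):
        entries_read += 1  # every entry for the value is visited
        if path.startswith(path_prefix + "/"):
            matches.append(path)  # the path filter is applied only afterwards
    return matches, entries_read


# Made-up content: 1000 tagged pages site-wide, only 5 under the queried folder.
tag = "departments/film-tvs"
nodes = {f"/content/site/other/page{i}": {"cq:tags": [tag]} for i in range(995)}
nodes.update({f"/content/site/directory/bio{i}": {"cq:tags": [tag]} for i in range(5)})

index = build_value_index(nodes, "cq:tags")
matches, entries_read = query(index, tag, "/content/site/directory")
print(len(matches), entries_read)  # 5 1000
```

In this model the work is proportional to the number of nodes carrying the tag anywhere in the repository, not to the size of the queried subtree.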

(There is not much experience out there to tune indexes on AEM 6.0, because at that time a lot of the indexing was still in flux.)

Level 7

I made our own index, as shown in the screenshot: a lucene-type index that also covers many other index keys. The issue is that when a query uses cq:tags, oak-1.0.22 always picks the OOTB index. I tried working around this by removing the OOTB one, which is always a bad idea. It turned out that an AEM process runs at 12:00pm every day to collect index garbage, and with the OOTB index missing, it traverses even more nodes while collecting the garbage. Right now, we have to give up some features to avoid triggering the resource-intensive query. Luckily, we will update our AEM to 6.5 at the beginning of 2023.

 

Thanks for your help!