I have a requirement where the AEM List component should return results based on a tag query with multiple tags, using the "any tag" search mode. It works just fine with 1-2 tags, but as the number of tags increases (5-6 is normal in our circumstances) it sometimes takes several minutes to return results.
Our content is fully covered by indexes, including tags properties. Logs do not show any traversal queries so I'm wondering what might be the issue.
The JCR query troubleshooting document mentions that "Queries With Many OR or UNION Conditions", which should be the case here, are very expensive, and it recommends using aggregated indexes instead of simple property-based ones.
I assume that under the hood the com.day.cq.wcm.foundation.List class uses the TagManager.find(String basePath, String tagIDs, boolean oneMatchIsEnough) method, so I'm wondering what query format that method uses and how we can improve performance for the queries described above.
Did you get a chance to check the query explain plan (dump your query to the logs and check)? It will show whether your query is picking up your custom indexes or whether the indexes need to be optimized: http://localhost:4502/libs/granite/operations/content/diagnosistools/queryPerformance.html
Tools > Operations > Diagnosis > QueryPerformance
Alternatively, you can check the slow-running queries on the server via http://localhost:4502/libs/granite/operations/content/diagnosistools/queryPerformance.html > Slow queries
The default/global index for tags is /oak:index/cqTagLucene, unless you have the cq:Tag node type in your custom index(es). Your query may use multiple indexes as it finds appropriate, which would be reflected in the explain plan.
The point is that there is no "my query". I'm actually trying to figure out what that query is, since the sources of the TagManager implementation are not freely available and the standard query debugging tools (logging the querybuilder component) do not provide any insight.
Have you tried placing some custom indexes for this type of query?
Once again, it's not an isolated query. We are using the standard AEM List component, so I just don't know what that query looks like in the TagManager implementation, and that is exactly what I'm trying to figure out in order to be able to tweak the indexes appropriately.
Create a logger for two packages via /system/console/configMgr:
com.day.cq.search & com.day.cq.tagging
Use debug/trace mode
Now, when you open any page, you will find the queries and a bunch of related information in the log. That's your starting point to debug further.
Tracing com.day.cq.tagging made sense. After digging through a number of messages I found one from com.day.cq.tagging.impl.JcrTagManagerImpl which actually revealed the underlying query. For the record, it was:
/jcr:root/content/xyz//element(*, cq:Taggable)[
  (@cq:tags = 'newsroom:event' or @cq:tags = '/etc/tags/newsroom/event') or
  (@cq:tags = 'newsroom:podcast' or @cq:tags = '/etc/tags/newsroom/podcast') or
  (@cq:tags = 'newsroom:Blog' or @cq:tags = '/etc/tags/newsroom/Blog' or @cq:tags = 'newsroom:Blog/blogseries' or @cq:tags = '/etc/tags/newsroom/Blog/blogseries') or
  (@cq:tags = 'newsroom:languages' or @cq:tags = '/etc/tags/newsroom/languages' or @cq:tags = 'newsroom:languages/arabic' or @cq:tags = '/etc/tags/newsroom/languages/arabic' or @cq:tags = 'newsroom:languages/english' or @cq:tags = '/etc/tags/newsroom/languages/english' or @cq:tags = 'newsroom:languages/french' or @cq:tags = '/etc/tags/newsroom/languages/french' or @cq:tags = 'newsroom:languages/russian' or @cq:tags = '/etc/tags/newsroom/languages/russian' or @cq:tags = 'newsroom:languages/spanish' or @cq:tags = '/etc/tags/newsroom/languages/spanish') or
  (@cq:tags = 'newsroom:featured' or @cq:tags = '/etc/tags/newsroom/featured') or
  (@cq:tags = 'newsroom:publications' or @cq:tags = '/etc/tags/newsroom/publications') or
  (@cq:tags = 'newsroom:news' or @cq:tags = '/etc/tags/newsroom/news') or
  (@cq:tags = 'newsroom:multimedia' or @cq:tags = '/etc/tags/newsroom/multimedia') or
  (@cq:tags = 'newsroom:success stories' or @cq:tags = '/etc/tags/newsroom/success stories') or
  (@cq:tags = 'newsroom:speeches' or @cq:tags = '/etc/tags/newsroom/speeches') or
  (@cq:tags = 'newsroom:in the news' or @cq:tags = '/etc/tags/newsroom/in the news') or
  (@cq:tags = 'newsroom:StaffNotes' or @cq:tags = '/etc/tags/newsroom/StaffNotes') or
  (@cq:tags = 'newsroom:projects' or @cq:tags = '/etc/tags/newsroom/projects' or @cq:tags = 'newsroom:projects/closed-projects' or @cq:tags = '/etc/tags/newsroom/projects/closed-projects' or @cq:tags = 'newsroom:projects/on-going-projects' or @cq:tags = '/etc/tags/newsroom/projects/on-going-projects') or
  (@cq:tags = 'newsroom:Photoessay' or @cq:tags = '/etc/tags/newsroom/Photoessay') or
  (@cq:tags = 'newsroom:for the record' or @cq:tags = '/etc/tags/newsroom/for the record') or
  (@cq:tags = 'newsroom:press release' or @cq:tags = '/etc/tags/newsroom/press release')
] order by @jcr:score descending
As you can see, TagManager generates an OR clause for every tag and for each of its children and grandchildren, and every one of them is matched twice (once by tag ID and once by its /etc/tags path). No wonder it takes so long to execute...
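To make the growth easier to see, here is a rough sketch in plain Java that reproduces the shape of the logged query. It is only a reconstruction from the log output above, not the actual JcrTagManagerImpl code (which I can't see). The class name TagQuerySketch and its helper methods are made up for illustration, and the caller is assumed to pass in each selected tag together with its already-resolved descendants.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

public class TagQuerySketch {

    // Hypothetical helper: turns a tag ID such as "newsroom:Blog/blogseries" into
    // the path form seen in the logged query ("/etc/tags/newsroom/Blog/blogseries").
    static String toTagPath(String tagId) {
        return "/etc/tags/" + tagId.replaceFirst(":", "/");
    }

    // One parenthesised group per selected tag: the tag itself plus its
    // descendants, each matched both by tag ID and by path.
    static String group(List<String> tagAndDescendants) {
        StringJoiner or = new StringJoiner(" or ", "(", ")");
        for (String id : tagAndDescendants) {
            or.add("@cq:tags = '" + id + "'");
            or.add("@cq:tags = '" + toTagPath(id) + "'");
        }
        return or.toString();
    }

    // Joins all groups with "or", mirroring the "any tag" query from the log.
    static String buildQuery(String basePath, List<List<String>> selectedTags) {
        StringJoiner groups = new StringJoiner(" or ");
        for (List<String> tag : selectedTags) {
            groups.add(group(tag));
        }
        return "/jcr:root" + basePath + "//element(*, cq:Taggable)[ "
                + groups + " ] order by @jcr:score descending";
    }

    // Counts the equality conditions the query engine has to evaluate.
    static int countConditions(String query) {
        String needle = "@cq:tags = ";
        int count = 0;
        for (int i = query.indexOf(needle); i >= 0; i = query.indexOf(needle, i + needle.length())) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<List<String>> selected = Arrays.asList(
                Arrays.asList("newsroom:event"),
                Arrays.asList("newsroom:Blog", "newsroom:Blog/blogseries"));
        String query = buildQuery("/content/xyz", selected);
        System.out.println(query);
        System.out.println("conditions: " + countConditions(query));
    }
}
```

Each tag or descendant contributes two conditions, so the count is driven by the size of the tag subtrees rather than by the number of tags picked in the List component. In the logged query above, 16 selected tags expand to 24 tag IDs and therefore 48 OR'ed conditions.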