Adobe Experience Manager Sites & More

bartek__w · 7/8/20

Hi all,

I have been implementing sitemap functionality. The sitemap lists all pages that are not hidden from search: each page has a custom checkbox in its page properties named `Hide in search`, and if it is set to true the page is not included in the sitemap. To get all pages for the sitemap I use the following query:

SELECT * FROM [cq:Page] WHERE ISDESCENDANTNODE('/content/site-root') AND ([jcr:content/hideInSearch]='false' OR [jcr:content/hideInSearch] IS NULL)

To keep the query fast I wanted to extend the /oak:index/cqPageLucene index with an entry for hideInSearch:

<jcr:root xmlns:jcr="http://www.jcp.org/jcr/1.0" xmlns:nt="http://www.jcp.org/jcr/nt/1.0"
  jcr:primaryType="nt:unstructured"
  name="jcr:content/hideInSearch"
  nullCheckEnabled="{Boolean}true"
  propertyIndex="{Boolean}true">
</jcr:root>

Unfortunately, after reindexing I see that the query traverses resources to match the IS NULL condition (if I remove that condition from the query, there is no traversal warning in the logs). I checked the Lucene index documentation page, and it seems that the nullCheckEnabled property should fix this, but it does not. Traversal also occurs when I keep only the null check, that is:

SELECT * FROM [cq:Page] WHERE ISDESCENDANTNODE('/content/site-root') AND [jcr:content/hideInSearch] IS NULL

Do you know what I am doing wrong, or what needs to be done to resolve this issue? Thanks in advance for your help.

Currently the only solution I have is to make sure that all pages have a value assigned to the hideInSearch property (setting a default value at the template level and updating the existing content with a Groovy script). This can be done quite easily, but it would still be great to understand what's wrong with my index definition.

Cheers!

Jörg_Hoh · 7/8/20

Hi,

For the sitemap I wouldn't use search, because it doesn't bring any benefit. With a JCR query you need to deal with the query (which just iterates through the index), and after that you still need to look up all remaining pages from the repo. Assuming that you don't have many pages which should not appear in the sitemap, a simple traversal of the content tree is easier to implement (no custom index, simple traversal) and has about the same runtime performance.
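A minimal sketch of the traversal idea, assuming a simplified in-memory page tree rather than the real AEM API: `PageNode`, `collectSitemapPaths`, and the sample paths are hypothetical names made up for illustration. A missing `hideInSearch` value is modelled as `null` and treated the same as `false`, mirroring the `='false' OR IS NULL` condition in the query, and only descendants of the root are visited, as with `ISDESCENDANTNODE`.

```java
import java.util.ArrayList;
import java.util.List;

final class SitemapTraversal {

    /** Hypothetical stand-in for a page; hideInSearch is null when the property is not set. */
    static final class PageNode {
        final String path;
        final Boolean hideInSearch;
        final List<PageNode> children = new ArrayList<>();

        PageNode(String path, Boolean hideInSearch) {
            this.path = path;
            this.hideInSearch = hideInSearch;
        }

        /** Adds a child page and returns this node so calls can be chained. */
        PageNode add(PageNode child) {
            children.add(child);
            return this;
        }
    }

    /** Collects the paths of all descendants of root that are not hidden from search. */
    static List<String> collectSitemapPaths(PageNode root) {
        List<String> paths = new ArrayList<>();
        // Start below the root, because ISDESCENDANTNODE does not match the root itself.
        for (PageNode child : root.children) {
            collect(child, paths);
        }
        return paths;
    }

    private static void collect(PageNode page, List<String> paths) {
        // A null (missing) or false value means the page belongs in the sitemap.
        if (!Boolean.TRUE.equals(page.hideInSearch)) {
            paths.add(page.path);
        }
        // A hidden page is skipped, but its children are still checked individually,
        // which matches the per-page condition used in the query.
        for (PageNode child : page.children) {
            collect(child, paths);
        }
    }

    public static void main(String[] args) {
        PageNode root = new PageNode("/content/site-root", null);
        PageNode hidden = new PageNode("/content/site-root/a", Boolean.TRUE);
        hidden.add(new PageNode("/content/site-root/a/child", null));
        root.add(hidden);
        root.add(new PageNode("/content/site-root/b", Boolean.FALSE));

        for (String path : collectSitemapPaths(root)) {
            System.out.println(path);
        }
    }
}
```

In a real implementation the same recursion would run over the repository's page objects, reading `hideInSearch` from each page's properties; that wiring is left out here because it depends on the project setup.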

vanegi · 7/8/20

Hi Bartosz,

Yes, setting the nullCheckEnabled property should be sufficient for this constraint.

For the query "SELECT * FROM [cq:Page] WHERE ISDESCENDANTNODE('/content/site-root') AND [jcr:content/hideInSearch] IS NULL", below is the structure for index definition:

- compatVersion = 2
- async = "async"
- jcr:primaryType = oak:QueryIndexDefinition
- evaluatePathRestrictions = true
- type = "lucene"
+ indexRules
  + cq:Page
    + properties
      + hideInSearch
        - name = "jcr:content/hideInSearch"
        - propertyIndex = true
        - nullCheckEnabled = true

I would also suggest adding aggregate rules to the index (/oak:index/testIndex/aggregates/cq:Page) so that the contents of descendant nodes are included as well, which makes it more optimized.

Thanks,

Vaishali

OAK index for null checks (nullcheckenabled)