SOLVED

OAK index for null checks (nullcheckenabled)


Hi all,

 

I have been implementing sitemap functionality. The sitemap lists all pages that are not hidden from search: each page has a custom checkbox in its page properties named `Hide in search`, and if it is set to true the page is not included in the sitemap. To get all pages for the sitemap I use the following query:

 

SELECT * FROM [cq:Page] WHERE ISDESCENDANTNODE('/content/site-root') AND ([jcr:content/hideInSearch]='false' OR [jcr:content/hideInSearch] IS NULL)

 

 

To keep the query fast I wanted to extend the /oak:index/cqPageLucene index with an entry for hideInSearch: 

 

<jcr:root xmlns:jcr="http://www.jcp.org/jcr/1.0" xmlns:nt="http://www.jcp.org/jcr/nt/1.0"
  jcr:primaryType="nt:unstructured"
  name="jcr:content/hideInSearch"
  nullCheckEnabled="{Boolean}true"
  propertyIndex="{Boolean}true">
</jcr:root>

 

Unfortunately, after reindexing I see that the query still traverses resources to match the IS NULL condition (if I remove that condition from the query, there is no traversal warning in the logs). I checked the Lucene index documentation page and it seems that the nullCheckEnabled property should fix this issue, but it does not. Traversal is also used when I keep only the null check, that is:

SELECT * FROM [cq:Page] WHERE ISDESCENDANTNODE('/content/site-root') AND [jcr:content/hideInSearch] IS NULL

 

Do you know what I am doing wrong, or what needs to be done to resolve this issue? Thanks in advance for your help.

 

Currently the only solution I have is to make sure that all pages have a value assigned to the hideInSearch property (setting a default value at template level and updating the existing content with a Groovy script). This can be done quite easily, but it would still be great to understand what's wrong with my index definition.

 

Cheers!

1 Accepted Solution

Correct answer by
Employee Advisor

Hi,

 

For the sitemap I wouldn't use search, because it doesn't come with any benefits. With a JCR query you pay for the query itself (which just iterates through the index), and after that you still need to look up all remaining pages from the repo. Assuming that you don't have many pages which should not appear in the sitemap, a simple traversal of the content tree is easier to implement (no custom index, simple traversal) and has about the same runtime performance.
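To illustrate the traversal idea, here is a minimal sketch in plain Java. `PageNode` is a hypothetical in-memory stand-in for a cq:Page, not an AEM type, so the example can run on its own. In an actual AEM component one would presumably walk `Page.listChildren()` and read the flag from the page's `ValueMap` instead. The sketch assumes a missing hideInSearch value means "not hidden" (matching the IS NULL branch of the query) and that children of a hidden page are still checked individually.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SitemapTraversal {

    /** Hypothetical in-memory stand-in for a cq:Page (assumption, not an AEM class). */
    public static final class PageNode {
        final String path;
        final Boolean hideInSearch; // null models a page where the property was never set
        final List<PageNode> children;

        public PageNode(String path, Boolean hideInSearch, PageNode... children) {
            this.path = path;
            this.hideInSearch = hideInSearch;
            this.children = Collections.unmodifiableList(Arrays.asList(children));
        }
    }

    /**
     * Collects the paths of all descendants of root that are not hidden.
     * The root itself is skipped, mirroring ISDESCENDANTNODE in the query.
     */
    public static List<String> collectVisiblePaths(PageNode root) {
        List<String> result = new ArrayList<>();
        for (PageNode child : root.children) {
            collect(child, result);
        }
        return result;
    }

    private static void collect(PageNode page, List<String> result) {
        // Only an explicit true hides the page; false and null both count as visible.
        if (!Boolean.TRUE.equals(page.hideInSearch)) {
            result.add(page.path);
        }
        // Keep descending even below a hidden page, since the flag is per page.
        for (PageNode child : page.children) {
            collect(child, result);
        }
    }

    public static void main(String[] args) {
        PageNode root = new PageNode("/content/site-root", null,
                new PageNode("/content/site-root/a", Boolean.FALSE),
                new PageNode("/content/site-root/b", Boolean.TRUE,
                        new PageNode("/content/site-root/b/c", null)));
        System.out.println(collectVisiblePaths(root));
    }
}
```

No index is involved here: every page is visited exactly once, and the null case needs no special handling because only an explicit true excludes a page.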

2 Replies

Employee

Hi Bartosz,

Yes, using the nullCheckEnabled property should be sufficient for this constraint.

 

For the query "SELECT * FROM [cq:Page] WHERE ISDESCENDANTNODE('/content/site-root') AND [jcr:content/hideInSearch] IS NULL", below is the structure for index definition:

 

 

- compatVersion = 2
- async = "async"
- jcr:primaryType = oak:QueryIndexDefinition
- evaluatePathRestrictions = true
- type = "lucene"
+ indexRules
  + cq:Page
    + properties
      + hideInSearch
        - name = "jcr:content/hideInSearch"
        - propertyIndex = true
        - nullCheckEnabled = true

 

 

I would also suggest adding some aggregate rules to the index (/oak:index/testIndex/aggregates/cq:Page) to include the contents of descendant nodes as well and make it more optimized.

 

Thanks,

Vaishali
