Fulltext search index depth | Community
jlpjb
March 4, 2022
Solved

Fulltext search index depth

  • March 4, 2022
  • 1 reply
  • 2308 views

I'm trying to add a fulltext search function using Lucene.

 

The issue is the depth of the fulltext search. Let me explain this through an example. Say you have two pages that each contain a text component with the term "Foo" in the text, and the search predicate is {fulltext=Foo, p.offset=0, p.limit=2, path=/content/mypath, type=cq:Page, p.excerpt=true}.
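For anyone who wants to reproduce the query outside the page code, here is a minimal sketch that turns the predicates above into a request against the QueryBuilder JSON servlet. The host and port (`http://localhost:4502`) and the helper name `querybuilder_url` are assumptions for illustration; authentication is left out.

```python
from urllib.parse import urlencode

# Predicates copied from the question.
predicates = {
    "fulltext": "Foo",
    "p.offset": "0",
    "p.limit": "2",
    "path": "/content/mypath",
    "type": "cq:Page",
    "p.excerpt": "true",
}


def querybuilder_url(base_url: str, predicates: dict) -> str:
    """Return a QueryBuilder JSON servlet URL for the given predicates."""
    return f"{base_url}/bin/querybuilder.json?{urlencode(predicates)}"


# Assumption: a local author instance on the default port.
url = querybuilder_url("http://localhost:4502", predicates)
print(url)
```

Opening the printed URL in a logged-in browser session should return the hits as JSON, which makes it easier to compare results before and after an index change.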

 

cq:Page Page 1
   jcr:content
      par1
         component_text (search term "Foo" is here)

cq:Page Page 2
    jcr:content
      par1
         component_wrapper
               component_text (search term "Foo" is here)

 

The default index will find Page 1 but not Page 2.

 

I created a custom index increasing the number of aggregates for cq:Page (xml included below). That successfully returns Page 1 and Page 2, but it also returns all the parent pages in the tree.


Any suggestions how to resolve this?

 

<?xml version="1.0" encoding="UTF-8"?>
<jcr:root xmlns:cq="http://www.day.com/jcr/cq/1.0"
          xmlns:jcr="http://www.jcp.org/jcr/1.0" xmlns:nt="http://www.jcp.org/jcr/nt/1.0"
    jcr:primaryType="oak:Unstructured"
    async="async"
    compatVersion="{Long}2"
    name="myPageLucene"
    reindex="{Boolean}false"
    includedPaths="[/content/mypath]"
    queryPaths="[/content/mypath]"
    evaluatePathRestrictions="{Boolean}true"
    reindexCount="{Long}1"
    type="lucene">
    <aggregates jcr:primaryType="nt:unstructured">
        <cq:Page jcr:primaryType="nt:unstructured">
            <include0
                jcr:primaryType="nt:unstructured"
                path="jcr:content"
                relativeNode="{Boolean}false"/>
            <include1
                jcr:primaryType="nt:unstructured"
                path="*/*/*"
                relativeNode="{Boolean}false"/>
            <include2
                jcr:primaryType="nt:unstructured"
                path="*/*/*/*"
                relativeNode="{Boolean}false"/>
            <include3
                jcr:primaryType="nt:unstructured"
                path="*/*/*/*/*"
                relativeNode="{Boolean}false"/>
            <include4
                jcr:primaryType="nt:unstructured"
                path="*/*/*/*/*/*"
                relativeNode="{Boolean}false"/>
            <include5
                jcr:primaryType="nt:unstructured"
                path="*/*/*/*/*/*/*"
                relativeNode="{Boolean}false"/>
            <include6
                jcr:primaryType="nt:unstructured"
                path="*/*/*/*/*/*/*/*"
                relativeNode="{Boolean}false"/>
            <include7
                jcr:primaryType="nt:unstructured"
                path="*/*/*/*/*/*/*/*/*"
                relativeNode="{Boolean}false"/>
            <include8
                jcr:primaryType="nt:unstructured"
                path="*/*/*/*/*/*/*/*/*/*"
                relativeNode="{Boolean}false"/>
            <include9
                jcr:primaryType="nt:unstructured"
                path="*/*/*/*/*/*/*/*/*/*/*"
                relativeNode="{Boolean}false"/>
        </cq:Page>
    </aggregates>
    <indexRules jcr:primaryType="nt:unstructured">
        <cq:Page jcr:primaryType="nt:unstructured"/>
    </indexRules>
</jcr:root>
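To reason about which nodes these include patterns pull into a cq:Page, here is a rough sketch of the wildcard matching. It is a simplified model, not Oak's implementation: it assumes each `*` stands for exactly one node name and that patterns are evaluated relative to the page node. The child page name `page2` is hypothetical.

```python
def matches(pattern: str, rel_path: str) -> bool:
    """Simplified model: '*' matches exactly one node name."""
    p_parts = pattern.split("/")
    r_parts = rel_path.split("/")
    return len(p_parts) == len(r_parts) and all(
        p == "*" or p == r for p, r in zip(p_parts, r_parts)
    )


# Include patterns from the index definition above:
# "jcr:content" plus wildcard paths of 3 to 11 segments.
includes = ["jcr:content"] + ["/".join(["*"] * n) for n in range(3, 12)]


def aggregated(rel_path: str) -> bool:
    """True if any include pattern matches the path relative to the page."""
    return any(matches(p, rel_path) for p in includes)


page1_text = "jcr:content/par1/component_text"
page2_text = "jcr:content/par1/component_wrapper/component_text"
# The same text node on Page 2, seen from a parent page
# ("page2" is a hypothetical child page node name).
from_parent = "page2/" + page2_text

print(aggregated(page1_text))   # True
print(aggregated(page2_text))   # True
print(aggregated(from_parent))  # True
```

Under this model the text node on a child page also matches a pattern when the path is taken relative to the parent page, because a bare `*` does not stop at page boundaries. That would be consistent with the parent pages showing up as hits. Anchoring the patterns at `jcr:content` (for example `jcr:content/*/*/*`) would exclude child pages in this model, but whether that fits the actual index is something to verify against a reindexed instance.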

Best answer by Anmol_Bhardwaj

If that is the query (see below), then it doesn't seem to work in the scenario I am trying to describe. There is content with "foo", and if I remove "fulltext.relPath = /" I get various hits, but that doesn't resolve the original problem: either content that is too deep in the page node structure is not found, or it is found but all the parent pages are returned as well.

 


My bad. Can you try with fulltext.relPath = . 

I have run a search for the same scenario in the past and was able to achieve this. I ran the same query in my debugger and can see the desired results.

 

1 reply

Anmol_Bhardwaj
Community Advisor
March 4, 2022

If you're just looking for a particular text authored inside a component, you don't need to set the depth.

Just change the query to:

path = <content-path>

fulltext = foo

fulltext.relPath = /

 

This will search all the properties of all the nodes under that path.
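To try this quickly, the three predicates can be written out as `key=value` lines, the text form the QueryBuilder debugger accepts. The sketch below only builds that text. The helper names are made up for illustration, and `/content/mypath` is taken from the question as a stand-in for the content path.

```python
def with_rel_path(content_path: str, term: str, rel_path: str) -> dict:
    """Build the suggested predicates for a given content path and search term."""
    return {
        "path": content_path,
        "fulltext": term,
        "fulltext.relPath": rel_path,
    }


def debugger_text(predicates: dict) -> str:
    """Render predicates as one key=value pair per line."""
    return "\n".join(f"{key}={value}" for key, value in predicates.items())


# Assumption: /content/mypath stands in for the real content path.
text = debugger_text(with_rel_path("/content/mypath", "foo", "/"))
print(text)
```

Changing the last argument of `with_rel_path` makes it easy to compare results for different `fulltext.relPath` values.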

jlpjb (Author)
March 7, 2022

Thanks, but that doesn't quite work. I want to be able to find the term "foo" on any page, regardless of its relative node depth from the jcr:content node (and regardless of which component contains it).  Does that make sense?

 

jlpjb (Author)
March 7, 2022

I was talking about the query above; I have pasted the same in the comment.


If that is the query (see below), then it doesn't seem to work in the scenario I am trying to describe. There is content with "foo", and if I remove "fulltext.relPath = /" I get various hits, but that doesn't resolve the original problem: either content that is too deep in the page node structure is not found, or it is found but all the parent pages are returned as well.