SOLVED

Using jcr:score, we are not retrieving search results with the highest occurrence first

Hi All,

We are using AEM 6.5 and want search results ordered by how often the search term (e.g. "insurance") occurs in each page, highest count first, but the results (web pages) are displayed in no apparent order.

Example, using the QueryBuilder Debugger:

path=/content/<project-path>
type=cq:Page
fulltext=insurance
orderby.sort=desc
orderby=@jcr:score

Please suggest how to list the pages with the most occurrences of the search term at the top of the search results using SQL2.

 

Thanks in advance.

1 Accepted Solution

Correct answer by
Community Advisor

Hi @GanaG,

This is correct behavior. In your example you are counting occurrences of the word "equipment" in the browser, but what you see in the browser is not necessarily the content of that specific page. Let's take a closer look at this one:


whereas http://localhost:4502/content/we-retail/ca/en/products/equipment.html, which is displayed next, shows 22 occurrences of the word "equipment".

If you go to CRX, you will see that this page contains only 2 occurrences of the word "equipment", not 22. The other occurrences you see in the browser are content from other pages, referenced via the Product Grid component.

I do not think counting in the browser is the correct way to determine how many times a specific word is used; you should rely on what is inside the nodes in CRX instead, because that content is what the index is built from, not the values visible on the rendered page.

I used a simple SQL2 query for the given scenario:

SELECT * FROM [cq:Page] AS s WHERE ISDESCENDANTNODE([/content/we-retail/ca/en]) and CONTAINS(s.*, 'Equipment') 

It gives the proper results when we look at what is in the repository. As for the SQL2 query itself, you do not have to add ORDER BY [jcr:score] DESC explicitly, because this is the default behavior. URL to the documentation that describes this:

If you want to control the order of the results, you can use the boost option in your index:

5 Replies

Hi @Himanshu_Jain 

We have tried the same but the results are not consistent.

Please try the query below:

http://localhost:4502/libs/cq/search/content/querydebug.html?_charset_=UTF-8&query=path%3D%2Fcontent...

The first result displayed shows 6 occurrences of the word "equipment" in the browser:

http://localhost:4502/content/we-retail/ca/en/equipment.html

whereas http://localhost:4502/content/we-retail/ca/en/products/equipment.html, which is displayed next, shows 22 occurrences of the word "equipment".

The last result shows 1 occurrence of the word "equipment": http://localhost:4502/content/we-retail/ca/en/experience/arctic-surfing-in-lofoten/jcr%3acontent/roo...

Ideally, Lucene should rank results by term frequency only, which is not happening.

Please check.

Thanks in advance
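
A note on why term frequency alone does not decide the order: assuming the index uses Lucene's classic TF-IDF similarity (an assumption about the setup, not something confirmed in this thread), the score for a term grows only with the square root of its frequency and is scaled down by the length of the indexed field. The Java sketch below is a simplified illustration of that formula, not Oak's or Lucene's actual code. The class name `TfIdfSketch`, the field lengths, and the document counts are all made up for the example.

```java
public class TfIdfSketch {

    /**
     * Simplified classic TF-IDF score for a single term in a single field.
     * Query normalization, boosts, and the lossy encoding of norms are omitted.
     */
    public static double score(int termFreq, int fieldLength, int docFreq, int numDocs) {
        double tf = Math.sqrt(termFreq);                                // grows slowly with frequency
        double idf = 1.0 + Math.log((double) numDocs / (docFreq + 1)); // same for every hit of one query
        double lengthNorm = 1.0 / Math.sqrt(fieldLength);              // longer fields are penalized
        return tf * idf * idf * lengthNorm;
    }

    public static void main(String[] args) {
        // Hypothetical index statistics: 100 indexed pages, 3 of them contain the term.
        int numDocs = 100;
        int docFreq = 3;

        // Hypothetical pages: a short one with 6 hits and a long one with 22 hits.
        double shortPage = score(6, 50, docFreq, numDocs);
        double longPage = score(22, 400, docFreq, numDocs);

        // The short page outranks the long one despite having fewer occurrences.
        System.out.println(shortPage > longPage); // prints "true"
    }
}
```

Under these assumptions, a page with fewer occurrences can legitimately score higher than one with more, so ordering by jcr:score is not the same as ordering by raw word count.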

Hi @lukasz-m 

With boost, results that match on a specific property such as jcr:title or jcr:description appear on top, but in our case we need to display the search results in descending order of the number of occurrences of the search term in a page (i.e. the maximum word count).

Ex: If "insurance" occurs 20 times in page1, 10 times in page2, 15 times in page3 and 25 times in page4, then we need to display the results as page4, page1, page3, page2.

Please provide your suggestions.

Thanks in advance!
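
One possible way to get a strict occurrence-count order is to re-sort the hits in application code after the query has run. The Java sketch below only illustrates that idea, using the page1 to page4 example above. It assumes the stored text of each page has already been collected into a plain string (for instance from the properties under the page's jcr:content node), which is not shown here. `OccurrenceRanker`, `countOccurrences`, and `rank` are hypothetical names, not part of any AEM or Oak API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class OccurrenceRanker {

    /**
     * Counts case-insensitive, non-overlapping occurrences of term in text.
     * This is plain substring matching, so "insurance" also matches inside
     * "insurances"; a Lucene analyzer would tokenize differently.
     */
    public static int countOccurrences(String text, String term) {
        if (text == null || term == null || term.isEmpty()) {
            return 0;
        }
        String haystack = text.toLowerCase(Locale.ROOT);
        String needle = term.toLowerCase(Locale.ROOT);
        int count = 0;
        int from = 0;
        while ((from = haystack.indexOf(needle, from)) != -1) {
            count++;
            from += needle.length();
        }
        return count;
    }

    /** Returns the page keys ordered by occurrence count, highest first. */
    public static List<String> rank(Map<String, String> pageText, String term) {
        List<String> pages = new ArrayList<>(pageText.keySet());
        pages.sort(Comparator
                .comparingInt((String page) -> countOccurrences(pageText.get(page), term))
                .reversed());
        return pages;
    }

    public static void main(String[] args) {
        // Hypothetical stored text per page, matching the counts in the example above.
        Map<String, String> pageText = new LinkedHashMap<>();
        pageText.put("page1", String.join(" ", Collections.nCopies(20, "insurance")));
        pageText.put("page2", String.join(" ", Collections.nCopies(10, "insurance")));
        pageText.put("page3", String.join(" ", Collections.nCopies(15, "insurance")));
        pageText.put("page4", String.join(" ", Collections.nCopies(25, "insurance")));

        System.out.println(rank(pageText, "insurance")); // prints [page4, page1, page3, page2]
    }
}
```

Counting in application code means every hit's text has to be read before sorting, so this is likely only practical for small result sets or when the count is precomputed.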

All my results have exactly the same jcr:score of 0.01. Am I missing some kind of configuration to enable this?

I cannot seem to find any helpful documentation on how to ensure this is configured correctly.