SOLVED

Relevance in TagManager.find()?

Level 5

What order are the results returned in from TagManager.find()?

 

I am trying to implement a search for multiple tags, where pages which match more tags appear higher in the list, and pages with fewer matching tags appear lower in the list.

 

For example, if I am searching for the tags "Apple", "Banana", and "Cherry": If any page has all three of these tags, it should appear before any page which has any two of the tags, which appears before any page with only 1 of the tags.  Is this possible using any of the AEM APIs?

1 Accepted Solution

Correct answer by
Level 5

I have discovered that I can use fulltext queries against the tags and then sort by jcr:score.  I'm leaving this here in the hopes that someone else finds this useful in the future. The idea is that this query powers a "Related Pages" component that exists at the bottom of my blog article pages in my site.

 

Here's the query that I am using:

 

 

path=/content/mysite/home
type=cq:Page
p.hits=full
fulltext=true # Probably not required: the fulltext predicate takes a search term, so this searches for the word 'true'

# I have a field on all pages for "Hide from Search"
1_group.1_boolproperty=@jcr:content/hide 
1_group.1_boolproperty.value=false

# exclude the current page from the search, as we're looking for pages related to this page
2_group.p.not=true
2_group.path=/content/mysite/home/current/page
2_group.path.self=true

# search for tags or specific template... each match increases the score
3_group.p.or=true
3_group.1_fulltext=mysite:tag-namespace/tag-1
3_group.1_fulltext.relPath=jcr:content/@cq:tags
3_group.2_fulltext=mysite:tag-namespace/tag-2
3_group.2_fulltext.relPath=jcr:content/@cq:tags
3_group.3_fulltext=mysite:tag-namespace/tag-3
3_group.3_fulltext.relPath=jcr:content/@cq:tags
3_group.4_fulltext=mysite:tag-namespace/tag-4
3_group.4_fulltext.relPath=jcr:content/@cq:tags
3_group.5_property=@jcr:content/cq:template # back-fill by template type
3_group.5_property.1_value=/conf/mysite/path/to/template/type

# each hit's score is calculated based on the fulltext results. Higher score = higher relevancy
4_orderby=@jcr:score
4_orderby.sort=desc

 

 

There are a few nuances here:

  1. For whatever reason, fulltext searching on tags requires the relPath of `jcr:content/@cq:tags` ... not sure why the @ is required at the cq:tags and not the jcr:content.  Actually I'm not sure what the @ represents and when and where it's necessary.  Maybe someone can chime in and let me know?
  2. I am opening an OR condition and doing a bunch of fulltext searches, with the last condition being a template filter.  This achieves the back-fill effect where I'm always guaranteed to get results so long as pages exist
  3. You can sort by @jcr:score, but it only really works if you are enabling fulltext searching.  In this case, I am just fulltext searching the name of the tag in the tags field

 

With some testing on my local machine, this appears to achieve the correct result.
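
For a component that receives the tag list at runtime, the predicates above can be generated rather than written by hand. The following is only a minimal sketch of that idea: the class name `RelatedPagesPredicates`, the method `build`, and its parameters are hypothetical, and the `p.limit` entry is an assumption added here to cap the result count. It leaves out the top-level `fulltext=true` line, which the post itself is unsure about.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that builds the QueryBuilder predicates shown above for any number of tags. */
class RelatedPagesPredicates {

    static Map<String, String> build(String searchRoot, String currentPagePath,
                                     List<String> tagIds, String templatePath, int limit) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("path", searchRoot);
        p.put("type", "cq:Page");
        p.put("p.hits", "full");
        p.put("p.limit", String.valueOf(limit)); // assumption: cap the number of hits

        // skip pages flagged "Hide from Search"
        p.put("1_group.1_boolproperty", "@jcr:content/hide");
        p.put("1_group.1_boolproperty.value", "false");

        // exclude the current page
        p.put("2_group.p.not", "true");
        p.put("2_group.path", currentPagePath);
        p.put("2_group.path.self", "true");

        // one fulltext predicate per tag, combined with OR
        p.put("3_group.p.or", "true");
        int i = 1;
        for (String tagId : tagIds) {
            p.put("3_group." + i + "_fulltext", tagId);
            p.put("3_group." + i + "_fulltext.relPath", "jcr:content/@cq:tags");
            i++;
        }
        // back-fill by template type as the last member of the OR group
        p.put("3_group." + i + "_property", "@jcr:content/cq:template");
        p.put("3_group." + i + "_property.1_value", templatePath);

        // highest score first
        p.put("4_orderby", "@jcr:score");
        p.put("4_orderby.sort", "desc");
        return p;
    }
}
```

In an AEM bundle, the resulting map would typically be passed to `PredicateGroup.create(map)` and then to `QueryBuilder.createQuery(predicateGroup, session)`; those calls are left out so the sketch compiles without the AEM libraries.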

7 Replies

Community Advisor

Hi @dylanmccurry ,

 

I think the best way to approach this is the QueryBuilder API.

  1. Create a query that combines the tags with an "OR".
  2. Once you have the result, get the hits.
  3. Iterate over them.
  4. Get the score for each hit and compare the scores, either in a comparator or manually.
  5. Sort the hits as you like.
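import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import javax.jcr.RepositoryException;
import com.day.cq.search.result.Hit;
import com.day.cq.search.result.SearchResult;

SearchResult result = query.getResult();
// Sort a copy of the hits, highest score first
List<Hit> hits = new ArrayList<>(result.getHits());
hits.sort(new Comparator<Hit>() {
  @Override
  public int compare(Hit hit1, Hit hit2) {
    try {
      // getScore() returns a double and can throw RepositoryException
      return Double.compare(hit2.getScore(), hit1.getScore());
    } catch (RepositoryException e) {
      return 0; // treat unreadable scores as equal
    }
  }
});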

Query Builder Reference : https://github.com/paulrohrbeck/aem-links/blob/master/querybuilder_cheatsheet.md 

 

Level 5

I noticed the getScore() method, but when I tried using this it returned the same value for all results.  Is there a trick to getting it working?

Community Advisor

Yes, my bad.

getScore() won't help here. Instead of the score, we would need a custom function that checks how many of the searched tags each hit contains.

Something like :

 

int count = 0; // number of searched tags found on this hit
String[] pageTags = hit.getProperties().get("cq:tags", String[].class);
if (pageTags != null) {
  for (String tag : tags) {
    if (Arrays.asList(pageTags).contains(tag)) {
      count++;
    }
  }
}

 

You can use this count in the comparator instead of the score.

 

Edit: tags is the array of tags you need to check, the same ones you built the query with initially.
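
To make the counting idea concrete, here is a minimal, self-contained sketch that ranks pages by the number of matching tags and keeps only the top few. It uses plain Java collections so it can run outside AEM: the map `tagsByPath` is an assumed stand-in for the path and `cq:tags` values you would read from each Hit, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical helper: ranks page paths by how many of the searched tags each page carries. */
class TagMatchRanker {

    /** Counts how many of the searched tag IDs appear among a page's tags. */
    static int countMatches(String[] pageTags, String[] searchedTags) {
        if (pageTags == null) {
            return 0; // page has no cq:tags property
        }
        Set<String> onPage = new HashSet<>(Arrays.asList(pageTags));
        int count = 0;
        for (String tag : searchedTags) {
            if (onPage.contains(tag)) {
                count++;
            }
        }
        return count;
    }

    /** Returns up to 'limit' page paths, the ones with the most matching tags first. */
    static List<String> topMatches(Map<String, String[]> tagsByPath, String[] searchedTags, int limit) {
        Comparator<Map.Entry<String, String[]>> byMatchCount =
                Comparator.comparingInt(e -> countMatches(e.getValue(), searchedTags));
        return tagsByPath.entrySet().stream()
                .filter(e -> countMatches(e.getValue(), searchedTags) > 0) // drop pages with no match
                .sorted(byMatchCount.reversed())                           // most matches first
                .limit(limit)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

Note that this still loads every hit into memory before trimming to the limit, so it only suits result sets of a manageable size.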

Level 5

I see where you can sort by @jcr:score, but it doesn't seem to do anything.

 

What about fulltext searching multiple keywords (e.g., each tag), on the tag field directly?  Or would that not work?

 

I will also give this a shot but I was hoping to not have to manipulate the results in memory and instead let the index handle it, as there's an unknown number of pages which could be returned and I want only the top 8 or so most relevant to the current page.

Community Advisor

jcr:score is open to implementation ( https://developer.adobe.com/experience-manager/reference-materials/spec/jcr/1.0/6.6.5.3_jcr_score_fu... ), so you would need to write more code to support your case. 

 

Fulltext won't give you what you need, it cannot look for a certain property, it looks for keywords. 

 

You can limit the query if that is the case. If you have any specific conditions that make a page relevant to the current page, you can add them to the query.

 

If you don't have something like that, the best way I can think of is actually sending out multiple queries one by one. 

  1. The first query uses "AND" for all the tags (these results should automatically rank highest).
  2. If that gives you 8 results, you can return here; otherwise continue.
  3. The second query uses "OR" for the first and second tags, and "AND" for the rest.
  4. Repeat this until you have gone over all the tags, appending results as you go along.
  5. If there are still fewer than 8, fall back to a plain "OR" and sort by @jcr:path.

I think you can get a loop going which will keep replacing "OR" for "AND", but it also needs lots of code.
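
The loop could look roughly like the sketch below. It is an in-memory illustration only, not QueryBuilder code: each stage filters an assumed map of page path to tag set (`tagsByPath`) where a real implementation would run a query, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory stand-in for the staged queries described above. */
class StagedTagSearch {

    /**
     * Stage k (0-based) ORs the first k+1 tags and ANDs the remaining ones,
     * so stage 0 requires every tag and the last stage accepts any single tag.
     */
    static boolean matchesStage(Set<String> pageTags, List<String> tags, int stage) {
        boolean anyRelaxedTag = false;
        for (int i = 0; i <= stage; i++) {
            anyRelaxedTag |= pageTags.contains(tags.get(i));
        }
        for (int i = stage + 1; i < tags.size(); i++) {
            if (!pageTags.contains(tags.get(i))) {
                return false; // a tag that is still required is missing
            }
        }
        return anyRelaxedTag;
    }

    /** Runs the stages in order, appending new paths until 'limit' results are collected. */
    static List<String> search(Map<String, Set<String>> tagsByPath, List<String> tags, int limit) {
        Set<String> results = new LinkedHashSet<>(); // keeps insertion order, ignores repeats
        for (int stage = 0; stage < tags.size() && results.size() < limit; stage++) {
            for (Map.Entry<String, Set<String>> page : tagsByPath.entrySet()) {
                if (results.size() >= limit) {
                    break;
                }
                if (matchesStage(page.getValue(), tags, stage)) {
                    results.add(page.getKey());
                }
            }
        }
        return new ArrayList<>(results);
    }
}
```

One caveat of this scheme is that the ordering depends on the order of the tags: a page that has only the first two of three tags is not picked up until the final stage, alongside pages that match a single tag.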

My recommendation:

Try fetching all the pages; that should be fine. If the query is slow, create a custom index.

 

 

Community Advisor

Hi @dylanmccurry 

I tried your scenario, and it works as expected with the code below.

Pass true as the third parameter (oneMatchIsEnough) of the find() method.

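import org.apache.sling.api.resource.Resource;
import org.apache.sling.api.resource.ResourceResolver;
import com.day.cq.commons.RangeIterator;
import com.day.cq.tagging.TagManager;

ResourceResolver resourceResolver = request.getResourceResolver();

// Tags given as tag IDs (absolute tag paths can be passed as well)
String[] allTags = {"we-retail:activity/biking", "we-retail:activity/hiking", "we-retail:activity/running"};

// TagManager instance
TagManager tagManager = resourceResolver.adaptTo(TagManager.class);

// Range iterator over the resources under the base path; true = one matching tag is enough
RangeIterator<Resource> resourceRangeIterator = tagManager.find("/content/learning", allTags, true);

// Defensive null check before iterating
while (resourceRangeIterator != null && resourceRangeIterator.hasNext()) {
    Resource result = resourceRangeIterator.next();
    String path = result.getPath();
    // Custom code implementation
}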

Regards,

Shiv

 

Shiv Prakash
