Adobe Experience Manager Sites & More

BHRoadKing · 3/29/24

I am using QueryBuilder in AEM 6.5. I need to query web pages AND PDF files for the search term and order the results so that the pages and PDF files are all listed together, ordered by relevance. But all I can get is the results grouped by type.

As for relevance, everything seems to have a score of 0.01. The only post I could find on this says that value is hard-coded in Jackrabbit Oak, which seems silly, and the post includes no explanation of why or how to overcome this.

I would LOVE to have someone explain to me how to write a query that will give me one set of results of all types sorted by relevance and NOT grouped by type.

Any suggestions?

Raja_Reddy · 3/29/24

Hi @BHRoadKing

To query web pages and PDF files together and order the results by relevance in AEM 6.5, you can use the QueryBuilder API and leverage the full-text search capabilities of Apache Jackrabbit Oak:

import java.util.HashMap;
import java.util.Map;

import javax.jcr.Session;

import org.osgi.service.component.annotations.Reference;

import com.day.cq.search.PredicateGroup;
import com.day.cq.search.Query;
import com.day.cq.search.QueryBuilder;
import com.day.cq.search.result.Hit;
import com.day.cq.search.result.SearchResult;

// Inject the QueryBuilder service (inside an OSGi component)
@Reference
private QueryBuilder queryBuilder;

// Perform the search; "session" is a javax.jcr.Session you already hold,
// e.g. adapted from the request's ResourceResolver
String searchTerm = "your search term";
int limit = 10; // Number of results to retrieve

Map<String, String> predicates = new HashMap<>();
predicates.put("path", "/content"); // Path to search within
predicates.put("group.p.or", "true"); // Combine the two nested groups with OR

// Full-text search condition for web pages (cq:Page), matched in jcr:content
predicates.put("group.1_group.type", "cq:Page");
predicates.put("group.1_group.fulltext", searchTerm);
predicates.put("group.1_group.fulltext.relPath", "jcr:content");

// Full-text search condition for PDF files (dam:Asset), matched in jcr:content/metadata
predicates.put("group.2_group.type", "dam:Asset");
predicates.put("group.2_group.fulltext", searchTerm);
predicates.put("group.2_group.fulltext.relPath", "jcr:content/metadata");

// Request relevance ordering explicitly; the OR alone does not sort the results
predicates.put("orderby", "@jcr:score");
predicates.put("orderby.sort", "desc");

Query query = queryBuilder.createQuery(PredicateGroup.create(predicates), session);
query.setHitsPerPage(limit);

SearchResult result = query.getResult();

// Process the search results
for (Hit hit : result.getHits()) {
    // Process each search result, e.g. hit.getPath()
}

Here we use the QueryBuilder service to create a query with multiple predicates. We specify the path to search within (/content in this case) and the node types to search (cq:Page for web pages, dam:Asset for PDF files).

To perform a full-text search on web pages, we add a predicate with the fulltext property set to the search term and the relPath property set to jcr:content. This searches for the search term within the content of the web pages.

Similarly, to perform a full-text search on PDF files, we add a predicate with the fulltext property set to the search term and the relPath property set to jcr:content/metadata. This searches for the search term within the metadata of the PDF files.

Combining these predicates with the group.p.or property set to true makes the search results include both web pages and PDF files. To have them ordered by relevance, also add orderby=@jcr:score together with orderby.sort=desc.
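
The predicate map described above can also be assembled by a small helper, which makes it easier to add further node types later. The following is only a sketch: RelevancePredicates and its build method are hypothetical names, not part of the AEM API, and it scopes each full-text condition to its own node type through nested groups (group.N_group.*). The resulting map is what you would pass to PredicateGroup.create(...).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not an AEM class): it only assembles QueryBuilder predicates.
final class RelevancePredicates {

    // targets maps a node type to the relative path its full-text condition should cover.
    static Map<String, String> build(String rootPath, String searchTerm,
                                     Map<String, String> targets) {
        Map<String, String> predicates = new LinkedHashMap<>();
        predicates.put("path", rootPath);
        predicates.put("group.p.or", "true"); // OR the per-type groups together

        int index = 1;
        for (Map.Entry<String, String> target : targets.entrySet()) {
            String prefix = "group." + index++ + "_group.";
            predicates.put(prefix + "type", target.getKey());
            predicates.put(prefix + "fulltext", searchTerm);
            predicates.put(prefix + "fulltext.relPath", target.getValue());
        }

        // Explicitly request ordering by full-text score, highest first.
        predicates.put("orderby", "@jcr:score");
        predicates.put("orderby.sort", "desc");
        return predicates;
    }

    public static void main(String[] args) {
        Map<String, String> targets = new LinkedHashMap<>();
        targets.put("cq:Page", "jcr:content");
        targets.put("dam:Asset", "jcr:content/metadata");

        // Prints the predicates as key=value lines.
        build("/content", "your search term", targets)
                .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Printing the map as key=value lines gives a form you can inspect or compare against the query you are currently running. Whether the scores differ meaningfully between hits still depends on the index that serves the query.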

Santhosh_Talepalle · 3/29/24

Thanks for the detailed explanation.

aanchal-sikka · 3/31/24

@BHRoadKing

Recommended Approach: Use separate queries which use specific indexes

I would suggest using separate queries for searching in Page and PDF. They have different node types. Page is cq:Page and PDF is dam:Asset.

With separate queries, each one can target an index defined for its own node type. Indexes scoped this way are smaller and more selective, so query time should be better.
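
Since the goal is one list sorted by relevance, the results of the two queries would then have to be merged in code. A minimal sketch of that merge follows. ScoredHit and ResultMerger are hypothetical names that stand in for whatever you extract from each hit (for example its path and score), and the sketch assumes the scores from both queries are comparable, which may not hold when they come from different indexes. If every hit reports the same score, as in the question, this merge cannot rank them meaningfully.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical value type: the path and score taken from one search hit.
final class ScoredHit {
    final String path;
    final double score;

    ScoredHit(String path, double score) {
        this.path = path;
        this.score = score;
    }
}

// Hypothetical helper that interleaves page and asset hits by score.
final class ResultMerger {

    // Concatenates both lists and sorts by score, highest first.
    // The sort is stable, so hits with equal scores keep their input order.
    static List<ScoredHit> mergeByScore(List<ScoredHit> pages, List<ScoredHit> assets) {
        List<ScoredHit> merged = new ArrayList<>(pages);
        merged.addAll(assets);
        merged.sort(Comparator.comparingDouble((ScoredHit hit) -> hit.score).reversed());
        return merged;
    }

    public static void main(String[] args) {
        // Made-up sample data, only to show the interleaving.
        List<ScoredHit> pages = Arrays.asList(
                new ScoredHit("/content/site/en/a", 0.9),
                new ScoredHit("/content/site/en/b", 0.2));
        List<ScoredHit> assets = Arrays.asList(
                new ScoredHit("/content/dam/docs/x.pdf", 0.5));

        for (ScoredHit hit : mergeByScore(pages, assets)) {
            System.out.println(hit.score + " " + hit.path);
        }
    }
}
```

With this approach, paging has to happen after the merge, so each query needs to fetch enough hits to fill the requested page.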

Alternate approach:

If you still intend to use a single query, you also need to ensure that the index can handle both cq:Page and dam:Asset. That is, it should be defined for a node type that is a supertype of both. As denoted in the screenshots below, you would need to create it for type nt:base.

Issue: nt:base is a supertype of most node types, so this index can bloat quickly. You would need to keep it lean by indexing only specific properties, restricting it to specific paths, and so on.

Full-text search works with either approach, but you need to choose carefully between the single-query and multiple-query approaches.

Aanchal Sikka