SOLVED

AEM's Indexing Landscape: Lucene, Solr, Remote Servers, and Oak

I would like to understand the various types of indexes available in AEM and when to use each of them effectively.

Could you shed some light on the following aspects?

  1. Lucene Index: What is the role of Lucene indexes in AEM, and in what scenarios would you recommend using them? Can you provide a real-world example where a Lucene index is the optimal choice?

  2. Solr Index: How does Solr indexing differ from Lucene, and what are the advantages of using a Solr index? Can you share a practical use case where a Solr index, integrated with a remote Solr server, makes a significant impact?

  3. Remote Solr Server: When does it make sense to configure AEM to use a remote Solr server for indexing and search? Can you share an example where leveraging a remote Solr server has provided benefits in terms of scalability or advanced search capabilities?

  4. Oak Index: What role do Oak indexes play in AEM, and how do they differ from Lucene and Solr? Can you describe a scenario where an Oak index is the most suitable choice for optimizing content queries in AEM?

Feel free to provide insights, examples, or best practices to help demystify the use cases for these indexing options in the AEM ecosystem. Your expertise will help guide newcomers like me!

@aanchal-sikka 

1 Accepted Solution

Correct answer by
Community Advisor

Hi @aem101 

Lucene Index

A Lucene index supports both property constraints and full-text constraints. Depending on the index definition, it can be used to evaluate property constraints, full-text constraints, path restrictions, and sorting. It is particularly useful for queries that involve full-text conditions.

https://myaemlearnings.blogspot.com/2020/05/lucene-index-in-aem-part-1.html
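To make the point about the index definition more concrete, here is a rough sketch of what a Lucene index definition could look like, written as a Python dictionary that mirrors the node structure under /oak:index. The index rule for cq:Page and the indexed property are hypothetical examples and are not taken from this thread. The setting names (type, async, compatVersion, evaluatePathRestrictions, indexRules, propertyIndex, analyzed, ordered) are the ones I understand Oak to use, so please verify them against the Oak documentation for your version.

```python
import json

# Sketch of a Lucene index definition as nested nodes and properties.
# The cq:Page rule and the "title" property entry are hypothetical examples.
lucene_index = {
    "jcr:primaryType": "oak:QueryIndexDefinition",
    "type": "lucene",
    "async": "async",                   # updated by the asynchronous indexer
    "compatVersion": 2,
    "evaluatePathRestrictions": True,   # lets the index evaluate path restrictions
    "indexRules": {
        "cq:Page": {
            "properties": {
                "title": {
                    "name": "jcr:content/jcr:title",
                    "propertyIndex": True,   # property constraints
                    "analyzed": True,        # full-text constraints on this property
                    "ordered": True,         # sorting
                }
            }
        }
    },
}


def indexed_properties(definition):
    """Return the property names covered by each node type rule."""
    return {
        node_type: sorted(p["name"] for p in rule["properties"].values())
        for node_type, rule in definition["indexRules"].items()
    }


covered = indexed_properties(lucene_index)
print(json.dumps(lucene_index, indent=2))
```

Each flag on the property maps to one of the capabilities mentioned above, which is why a single Lucene index can serve several kinds of query conditions.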

 

  • Lucene indexes offer many more features than property indexes. For example, a property index can only index a single property while a Lucene index can include many. 
  • Lucene indexes are asynchronous. While this offers a considerable performance boost, it can also induce a delay between when data is written to the repository and when the index is updated. If it is vital to have queries return 100% accurate results, a property index would be required.
  • By virtue of being asynchronous, Lucene indexes cannot enforce uniqueness constraints. If this is required, then a property index needs to be put in place.
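The difference between a synchronous property index and an asynchronous Lucene index can be illustrated with a small toy model. This is not Oak code: the class, the method names, and the sku property are all made up for illustration. It only shows why a write can be invisible to a full-text query for a short time, and why only the synchronous index is in a position to reject duplicates.

```python
class ToyRepository:
    """Toy model: one synchronous unique index and one asynchronous text index.

    Simplification: nodes are only created, never updated or deleted.
    """

    def __init__(self):
        self.nodes = {}        # path -> {"sku": ..., "text": ...}
        self.sync_index = {}   # sku -> path, updated as part of save()
        self.async_index = {}  # word -> set of paths, updated by run_async_indexer()
        self.pending = []      # paths written since the last asynchronous run

    def save(self, path, sku, text):
        # The synchronous index is checked and updated together with the write,
        # so it can enforce uniqueness.
        if sku in self.sync_index:
            raise ValueError(f"sku {sku!r} is already used by {self.sync_index[sku]}")
        self.nodes[path] = {"sku": sku, "text": text}
        self.sync_index[sku] = path
        self.pending.append(path)

    def run_async_indexer(self):
        # Stands in for the background indexing job that catches up later.
        for path in self.pending:
            for word in self.nodes[path]["text"].lower().split():
                self.async_index.setdefault(word, set()).add(path)
        self.pending.clear()

    def find_by_sku(self, sku):
        return self.sync_index.get(sku)

    def search_text(self, word):
        return sorted(self.async_index.get(word.lower(), set()))


repo = ToyRepository()
repo.save("/content/a", "SKU-1", "red running shoes")
before = repo.search_text("shoes")   # asynchronous index has not caught up yet
by_sku = repo.find_by_sku("SKU-1")   # synchronous index already sees the write
repo.run_async_indexer()
after = repo.search_text("shoes")    # now the full-text query finds the node
```

In this model, saving a second node with the same sku raises an error immediately, while the text index only learns about a node once the indexer has run.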

Solr Index

The primary purpose of the Solr index is full-text search, but it can also serve queries with path restrictions, property restrictions, and primary type restrictions. This means that the Solr index in Oak can be used for any type of JCR query.

 

Use case: Solr should be considered when the AEM instances do not have the CPU capacity to handle the number of queries required in search-intensive deployments, such as search-driven websites with a high number of concurrent users. Alternatively, Solr can be implemented in a crawler-based approach to use some of the more advanced features of the platform.

 

Remote Solr Server:

 

A remote Solr server is typically used for production-level (non-development) environments. These deployments usually take advantage of the fault-tolerance features of SolrCloud. A remote Solr cluster can be scaled independently of AEM, can use features such as ZooKeeper-based coordination, and can also serve as a search system for other consumers.

https://adobecommunity.com/pdf/meetup-3/Integration_AEM_and_Apache_Solr.pdf
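As an illustration of the "other consumers" point, a separate application could query the same remote Solr server directly over HTTP instead of going through AEM. The sketch below only builds the request URL for Solr's standard /select handler. The host name, the collection name aem_pages, and the field name site_s are assumptions for the example and depend entirely on how your Solr schema and indexing are set up.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def solr_select_url(base_url, collection, text, site=None, rows=10):
    """Build a URL for Solr's /select handler (nothing is sent over the network)."""
    params = [("q", text), ("rows", str(rows)), ("wt", "json")]
    if site:
        # "site_s" is a hypothetical field; use whatever your schema defines.
        params.append(("fq", f'site_s:"{site}"'))
    return f"{base_url.rstrip('/')}/{collection}/select?{urlencode(params)}"


# Hypothetical host and collection, for illustration only.
url = solr_select_url(
    "http://solr.example.internal:8983/solr",
    "aem_pages",
    "running shoes",
    site="wknd",
)
parts = urlsplit(url)
query = parse_qs(parts.query)
print(url)
```

Because the search load goes to the Solr cluster, such consumers do not add query load to the AEM instances.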

 

Oak Index:

 

Oak supports the indexing of content that is stored in the repository. It supports Lucene-based indexes for both property and full-text constraints. If multiple indexes are available for a query, each available indexer estimates the cost of executing the query, and Oak chooses the one with the lowest estimated cost.
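The cost-based selection can be pictured with a small toy model. This is not how Oak computes costs internally: the index names, the cost numbers, and the rules for when an index applies are invented for the example. The only part taken from the description above is the principle that every index reports an estimated cost and the cheapest one is used.

```python
from dataclasses import dataclass
from typing import Callable

INF = float("inf")


@dataclass(frozen=True)
class Query:
    properties: frozenset   # names of the constrained properties
    fulltext: bool = False  # whether the query has a full-text condition


@dataclass
class ToyIndex:
    name: str
    estimate_cost: Callable[[Query], float]


def choose_index(query, indexes):
    """Pick the index that reports the lowest estimated cost."""
    return min(indexes, key=lambda idx: idx.estimate_cost(query))


# Made-up cost rules: an index that cannot answer a query reports infinity.
lucene_covered = frozenset({"jcr:title", "sku"})
indexes = [
    ToyIndex("traversal", lambda q: 1_000_000.0),
    ToyIndex(
        "skuProperty",
        lambda q: 2.0 if q.properties == {"sku"} and not q.fulltext else INF,
    ),
    ToyIndex(
        "pageLucene",
        lambda q: 10.0 if q.properties <= lucene_covered else INF,
    ),
]

by_sku = choose_index(Query(frozenset({"sku"})), indexes).name
by_text = choose_index(Query(frozenset({"jcr:title"}), fulltext=True), indexes).name
unindexed = choose_index(Query(frozenset({"foo"})), indexes).name
```

With these invented numbers, the sku lookup goes to the property index, the full-text query goes to the Lucene index, and the query on an unindexed property falls back to traversal, which is the situation you want to avoid.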

 

Best practices:

https://experienceleague.adobe.com/docs/experience-manager-65/deploying/practices/best-practices-for...

 

 
