AEM 6.5.15 Reindexing loop issue | Community
Level 2
April 12, 2023
Solved

AEM 6.5.15 Reindexing loop issue

  • April 12, 2023
  • 2 replies
  • 1859 views

Hi everyone, 

We have an AEM project whose repository was about 90 GB and has recently grown rapidly to 190 GB.

AEM version - 6.5.15

Apache Jackrabbit Oak - 1.22.13

On this project, re-indexing after a deployment takes very long. The process ran for about 4 hours and consumed all RAM (24 GB), and the instance hung as a result. It stopped responding to any operation and threw java.lang.OutOfMemoryError: GC overhead limit exceeded. We killed the AEM process and restarted it.

After AEM restarted, reindexing started again:

 

06.04.2023 16:38:02.192 *INFO* [async-index-update-async] org.apache.jackrabbit.oak.plugins.index.IndexUpdate /oak:index/someIndex1 => Indexed 10000 nodes in 2.506 s
...
06.04.2023 16:38:02.253 *INFO* [async-index-update-async] org.apache.jackrabbit.oak.plugins.index.IndexUpdate /oak:index/workflowDataLucene => Indexed 10000 nodes in 2.585 s
...
06.04.2023 16:38:02.255 *INFO* [async-index-update-async] org.apache.jackrabbit.oak.plugins.index.IndexUpdate /oak:index/ntBaseLucene => Indexed 10000 nodes in 2.560 s
...
06.04.2023 16:38:02.292 *INFO* [async-index-update-async] org.apache.jackrabbit.oak.plugins.index.IndexUpdate /oak:index/someIndex2 => Indexed 10000 nodes in 2.608 s
...
06.04.2023 16:38:02.292 *INFO* [async-index-update-async] org.apache.jackrabbit.oak.plugins.index.IndexUpdate /oak:index/someIndex3 => Indexed 10000 nodes in 2.605 s
...
06.04.2023 16:38:02.292 *INFO* [async-index-update-async] org.apache.jackrabbit.oak.plugins.index.IndexUpdate /oak:index/templateIndex => Indexed 10000 nodes in 2.609 s
...
06.04.2023 16:38:02.292 *INFO* [async-index-update-async] org.apache.jackrabbit.oak.plugins.index.IndexUpdate /oak:index/socialLucene => Indexed 10000 nodes in 2.626 s
...
06.04.2023 16:38:02.292 *INFO* [async-index-update-async] org.apache.jackrabbit.oak.plugins.index.IndexUpdate /oak:index/cmLucene => Indexed 10000 nodes in 2.611 s
...
06.04.2023 16:38:02.292 *INFO* [async-index-update-async] org.apache.jackrabbit.oak.plugins.index.IndexUpdate /oak:index/nodetypeLucene => Indexed 10000 nodes in 2.604 s
...
06.04.2023 16:38:02.305 *INFO* [async-index-update-async] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Incremental indexing Traversed #10000 /var/audit/com.day.cq.wcm.core.page/content/dam/path/to/image.JPG [1.19 nodes/s, 4293.90 nodes/hr] (Elapsed 2.639 s)
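To compare indexing speed across indexes, the "Indexed N nodes in T s" entries above can be turned into nodes per second. The following is a minimal sketch, not an official tool: the function name index_rates is made up for illustration, and it assumes the log format is exactly as shown in the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: reads AEM log text on stdin and prints
# "<index path> <nodes per second>" for every "=> Indexed N nodes in T s" entry.
# It scans every field, so it also works when several entries share one line.
index_rates() {
  awk '
    / => Indexed [0-9]+ nodes in [0-9.]+ s/ {
      for (i = 2; i + 5 <= NF; i++) {
        # Layout around "=>": <path> => Indexed <count> nodes in <seconds> s
        if ($i == "=>" && $(i + 1) == "Indexed" && $(i + 5) > 0) {
          printf "%s %.0f\n", $(i - 1), $(i + 2) / $(i + 5)
        }
      }
    }'
}
```

Usage, assuming the log lives at crx-quickstart/logs/error.log (adjust the path for your installation): index_rates < crx-quickstart/logs/error.log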

 

We waited for reindexing to finish several times, but each time the instance hung and became inaccessible.

After that, we ran offline compaction with AEM stopped, ran all maintenance tasks with AEM running, and then ran offline reindexing of all Lucene indexes. The first attempt, with 32 GB of RAM, failed. With 100 GB of RAM it succeeded and took about 5 hours: "Indexing completed and imported successfully in 4.964 h (17870089 ms)". The command I ran:

 

sudo nohup java -Dtar.memoryMapped=true -Doak.compaction.eagerFlush=true -Doak.index.ramBufferSizeMB=4096 \
  -server -Xmx100g -Dcompaction-progress-log=5000000 -Dcompress-interval=150000000 \
  -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=./ \
  -jar compaction/oak-run-1.22.13.jar index --reindex \
  --index-paths=/oak:index/workflowDataLucene,/oak:index/slingeventJob,/oak:index/versionStoreIndex,/oak:index/commerceLucene,/oak:index/authorizables,/oak:index/text,/oak:index/newsHighlightsIndex,/oak:index/templateIndex,/oak:index/ntFolderDamLucene,/oak:index/someIndex1,/oak:index/damAssetLucene,/oak:index/someIndex2,/oak:index/someIndex2,/oak:index/nodetypeLucene,/oak:index/ntBaseLucene,/oak:index/cqTagLucene,/oak:index/lucene,/oak:index/repTokenIndex,/oak:index/someIndex3,/oak:index/cqPageLucene,/content/project-path/oak:index/someIndex4,/content/project-path/oak:index/someIndex5,/content/project-path/markets/oak:index/lastModifiedIndex,/content/project-path/hq/de_DE/competitor/oak:index/scaleComponentIndex \
  --read-write --fds-path=crx-quickstart/repository/repository/datastore \
  crx-quickstart/repository/segmentstore \
  >> compaction/oak-reindex.log 2>>compaction/oak-reindex-error.log
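Note that the --index-paths list above names /oak:index/someIndex2 twice. Whether oak-run tolerates a repeated path is not established in this thread, so it may be safer to deduplicate the list before passing it in. A small sketch, with dedupe_index_paths as a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: takes a comma-separated list of index paths and prints
# it with empty entries and repeated paths removed, keeping the original order.
dedupe_index_paths() {
  printf '%s\n' "$1" |
    tr ',' '\n' |               # one path per line
    awk 'NF && !seen[$0]++' |   # drop blanks, keep first occurrence of each path
    paste -sd, -                # join back into a comma-separated list
}
```

The result can then be used as --index-paths="$(dedupe_index_paths "$PATHS")", where PATHS is a variable holding the original list.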

 

I was happy and thought the issue was solved, because when AEM starts it should detect the reindexed indexes and import them.

However, after we started AEM with 120 GB of RAM, it again ran indexing and incremental reindexing. This took 7.808 h, and the instance hung because all 120 GB of RAM was consumed.

Please suggest how to stop this repeated reindexing of a large repository.

Best answer by koha26

We noticed that re-indexing inside AEM traverses nodes very slowly, at most 100-200 nodes per second, whereas offline re-indexing handles thousands of nodes per second.

We fixed the issue with the following plan:

  1. Create a checkpoint
  2. Stop AEM
  3. Run oak-run in console mode:

     

 

sudo java -jar compaction/oak-run-1.22.13.jar console --read-write --fds-path=crx-quickstart/repository/repository/datastore crx-quickstart/repository/segmentstore

 

Run the following Groovy script in the oak-run console. It sets the "async" lane property on /:async to the checkpoint you created, which marks the indexes as up to date so that AEM does not reindex them.

import org.apache.jackrabbit.oak.api.Type
import org.apache.jackrabbit.oak.commons.PathUtils
import org.apache.jackrabbit.oak.plugins.memory.ArrayBasedBlob
import org.apache.jackrabbit.oak.plugins.memory.PropertyStates
import org.apache.jackrabbit.oak.spi.commit.CommitInfo
import org.apache.jackrabbit.oak.spi.commit.EmptyHook
import org.apache.jackrabbit.oak.spi.state.ChildNodeEntry
import org.apache.jackrabbit.oak.spi.state.NodeBuilder
import org.apache.jackrabbit.oak.spi.state.NodeState
import org.apache.jackrabbit.oak.spi.state.NodeStateUtils
import org.apache.jackrabbit.oak.spi.state.NodeStore

// Checkpoint to record, and the indexing lane it applies to
updatedCheckpoint = "<enter created checkpoint here>"
indexLane = "async"

// Walk down from the root builder to the builder for the given path
NodeBuilder childBuilder(NodeBuilder root, String path) {
    NodeBuilder nb = root
    for (String nodeName : PathUtils.elements(path)) {
        nb = nb.child(nodeName)
    }
    return nb
}

// Print the current state of /:async
ns = session.store
indexPath = "/:async"
nodeState = NodeStateUtils.getNode(ns.root, indexPath)
println "Info $nodeState"

// Set the lane property to the checkpoint and commit the change
builder = ns.root.builder()
file = childBuilder(builder, indexPath)
file.setProperty(indexLane, updatedCheckpoint, Type.STRING)
ns.merge(builder, EmptyHook.INSTANCE, CommitInfo.EMPTY)

// Print the updated state for verification
newNodeState = NodeStateUtils.getNode(ns.root, indexPath)
println "updated $newNodeState"

 

  4. Start AEM, then check the IndexStats MBean for the async lane:
     https://aem.author.host:4502/system/console/jmx/org.apache.jackrabbit.oak%3Aname%3Dasync%2Ctype%3DIndexStats
     Status should be "done" and LastIndexedTime should be updated.

  5. Create a checkpoint

  6. Run oak-run.jar reindexing (https://experienceleague.adobe.com/docs/experience-manager-65/deploying/deploying/oak-run-indexing-usecases.html?lang=en#reindexsegmentnodestore ) with the created checkpoint

  7. Start AEM
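For step 6, the exact command is not given in this answer. The sketch below only prints a candidate command so it can be reviewed before anything touches the repository. It reuses the jar and repository paths from the commands in this thread and assumes the --checkpoint option of oak-run's index command; reindex_cmd is a hypothetical helper name, and whether --read-write or other flags are also needed should be checked against the linked Adobe page.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints (does not execute) an oak-run reindex
# command pinned to a checkpoint.
# Usage: reindex_cmd <checkpoint-id> <comma-separated index paths>
reindex_cmd() {
  checkpoint=$1
  index_paths=$2
  # Assumption: --checkpoint is accepted by "oak-run index"; verify the full
  # flag set against the Adobe documentation before running the output.
  echo java -jar compaction/oak-run-1.22.13.jar index --reindex \
    "--checkpoint=$checkpoint" \
    "--index-paths=$index_paths" \
    --fds-path=crx-quickstart/repository/repository/datastore \
    crx-quickstart/repository/segmentstore
}
```

Example: reindex_cmd "<checkpoint from step 5>" /oak:index/damAssetLucene prints the command line for review.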

2 replies

Saravanan_Dharmaraj
Community Advisor
April 12, 2023

@koha26 IMO, considering how time-consuming it is to test and figure this out, I would suggest opening a ticket with Adobe to find a solution. They might clone the instance and troubleshoot it on the engineering side using a heap dump. Hope that works!

koha26 (Author, accepted solution)
Level 2
April 24, 2023
