
FileStore TarMK GC: compaction skipped. Not enough memory


Level 4

Team.

Our author and publisher instances have been randomly failing TarMK GC due to lack of memory since we upgraded to Java 11.

Anyone else have this issue? Any suggestions?

Thanks for your assistance.

Last FileStore TarMK GC Results
13.12.2024 02:00:00.077 *INFO* [TarMK revision gc [/author/author65/crx-quickstart/repository/segmentstore]] org.apache.jackrabbit.oak.segment.file.FileStore TarMK GC #22: started
13.12.2024 02:00:00.078 *INFO* [TarMK revision gc [/author/author65/crx-quickstart/repository/segmentstore]] org.apache.jackrabbit.oak.segment.file.FileStore TarMK GC #22: estimation started
13.12.2024 02:00:00.156 *INFO* [TarMK revision gc [/author/author65/crx-quickstart/repository/segmentstore]] org.apache.jackrabbit.oak.segment.file.FileStore TarMK GC #22: estimation completed in 78.88 ms (78 ms). Segmentstore size has increased since the last tail garbage collection from 37.6 GB (37552579584 bytes) to 51.0 GB (50996005888 bytes), an increase of 13.4 GB (13443426304 bytes) or 35%. This is greater than sizeDeltaEstimation=1.1 GB (1073741824 bytes), so running garbage collection
13.12.2024 02:00:00.157 *INFO* [TarMK revision gc [/author/author65/crx-quickstart/repository/segmentstore]] org.apache.jackrabbit.oak.segment.file.FileStore TarMK GC #22: setting up a listener to cancel compaction if available memory on pool 'PS Survivor Space' drops below 5.5 MB (5505024 bytes) / 15%.
13.12.2024 02:00:00.157 *WARN* [TarMK revision gc [/author/author65/crx-quickstart/repository/segmentstore]] org.apache.jackrabbit.oak.segment.file.FileStore TarMK GC #22: canceling compaction because available memory level 76.8 kB (76760 bytes) is too low, expecting at least 5.5 MB (5505024 bytes)
13.12.2024 02:00:00.157 *INFO* [TarMK revision gc [/author/author65/crx-quickstart/repository/segmentstore]] org.apache.jackrabbit.oak.segment.file.FileStore TarMK GC #22: compaction skipped. Not enough memory
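If I read the 15% threshold correctly against the pool's maximum size (an assumption on my part), the monitored pool is tiny:

5,505,024 bytes / 0.15 ≈ 36.7 MB maximum 'PS Survivor Space'

So the survivor space only has to drop below ~5.5 MB of free memory for compaction to be cancelled.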

9 Replies


Level 7

Hi @Tom_Fought,

First of all, you need to make more memory available to the JVM. This warning is descriptive:
org.apache.jackrabbit.oak.segment.file.FileStore TarMK GC #22: canceling compaction because available memory level 76.8 kB (76760 bytes) is too low, expecting at least 5.5 MB (5505024 bytes) 

In addition, I would suggest stopping the instances and performing offline compaction https://experienceleague.adobe.com/en/docs/experience-manager-65/content/implementing/deploying/depl... , then an Online Revision Cleanup.
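For reference, a typical offline compaction invocation looks roughly like this (a sketch only: match the oak-run version to the Oak version of your instance, the heap value is illustrative, and the segmentstore path is taken from the log above):

java -Xmx8g -jar oak-run-<oak-version>.jar compact /author/author65/crx-quickstart/repository/segmentstore

Run it only while the instance is completely stopped.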


Best regards,

Kostiantyn Diachenko.


Level 4

Thanks for the suggestions. We have plenty of free disk space. 

I'm thinking this has something to do with how much memory is being allocated on the Java command line in the start script.

My dev/QA machines have 16 GB. They were running with -Xmx10g. I have reduced it to -Xmx8g to see whether that has any impact. I tested it manually via the daily maintenance schedule, and it seems to have made a difference.

Production machines have 32 GB. I'm reducing -Xmx from 26g to 22g.

Time will tell.
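For anyone comparing setups: on a default on-premise install the heap flags go in the CQ_JVM_OPTS variable of crx-quickstart/bin/start. A sketch with illustrative values for one of the 16 GB dev machines:

CQ_JVM_OPTS='-server -Xms8g -Xmx8g -XX:+UseParallelGC -Djava.awt.headless=true'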


Level 7

I remember that we added additional JVM parameters to the start command for AEM 6.5 on-premise on Java 11.


Here is the documentation: https://experienceleague.adobe.com/en/docs/experience-manager-65/content/implementing/deploying/depl...


Java 11 Considerations
If you are running Oracle Java 11 (or generally versions of Java newer than 8), additional switches must be added to your command line when starting AEM.

The following --add-opens switches need to be added to prevent related reflection-access WARNING messages in stdout.log:

--add-opens=java.desktop/com.sun.imageio.plugins.jpeg=ALL-UNNAMED --add-opens=java.base/sun.net.www.protocol.jrt=ALL-UNNAMED --add-opens=java.naming/javax.naming.spi=ALL-UNNAMED --add-opens=java.xml/com.sun.org.apache.xerces.internal.dom=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/jdk.internal.loader=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED -Dnashorn.args=--no-deprecation-warning


Additionally, you need to use the -XX:+UseParallelGC switch to mitigate any potential performance issues.
Below is a sample of how the additional JVM parameters should look when starting AEM on Java 11:

-XX:+UseParallelGC --add-opens=java.desktop/com.sun.imageio.plugins.jpeg=ALL-UNNAMED --add-opens=java.base/sun.net.www.protocol.jrt=ALL-UNNAMED --add-opens=java.naming/javax.naming.spi=ALL-UNNAMED --add-opens=java.xml/com.sun.org.apache.xerces.internal.dom=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/jdk.internal.loader=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED -Dnashorn.args=--no-deprecation-warning


Finally, if you are running an instance upgraded from AEM 6.3, make sure the following property is set to true in sling.properties: felix.bootdelegation.implicit
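If that property does need to be added, it is a plain key/value line in sling.properties (located under crx-quickstart/conf on a default install); a one-line sketch:

felix.bootdelegation.implicit=true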


Level 4

Thank you for your response. 

I have confirmed my AEM servers have been running with all the --add-opens switches, the Nashorn flag, and -XX:+UseParallelGC.

I checked for felix.bootdelegation.implicit and it is not in sling.properties. The Apache Felix website states:

felix.bootdelegation.implicit - Specifies whether the framework should try to guess when to implicitly boot delegate to ease integration with external code. The default value is TRUE.

So I don't think I need to make that change.

Out of curiosity, how much physical memory is on your AEM server hosts, and what are the command-line settings for -Xmx, -Xms, and -XX:ReservedCodeCacheSize?

Thanks

Hi @Tom_Fought,


For AEM 6.5 (on-premise setup), the JVM memory settings and configurations should account for the distinct roles of author and publisher instances. These settings aim to optimize performance for content authorship, replication, and content delivery.


Memory Recommendations
Physical Memory
This configuration ultimately depends on your project's traffic.

  • AEM Author: Requires more resources as it handles workflows, content editing, and indexing.
    • Typical setup: 32 GB RAM (minimum).
    • High-traffic or workflow-intensive environments: 64 GB or more.
  • AEM Publisher: Handles end-user traffic and is optimized for high concurrency.
    • Typical setup: 16-32 GB RAM depending on traffic.

JVM Heap and Other Settings
AEM Author Instance:


-Xms16g                          # Initial heap size (50% of max heap)
-Xmx32g                          # Maximum heap size (50-70% of total physical RAM)
-XX:ReservedCodeCacheSize=512m   # JIT code cache for compiled Java code
-XX:+UseParallelGC               # Optimize for parallel garbage collection


Why?

  • Authoring involves resource-intensive tasks like indexing, workflows, and asset processing.
  • Large heap sizes minimize frequent garbage collection (GC) pauses during content authoring and replication.

AEM Publisher Instance:


-Xms8g                           # Initial heap size (50% of max heap)
-Xmx16g                          # Maximum heap size (50-60% of total physical RAM)
-XX:ReservedCodeCacheSize=256m   # Less resource-intensive compared to the author
-XX:+UseParallelGC               # Balance throughput and responsiveness


Why?

  • Publishers focus on fast content delivery and high concurrency.
  • Smaller heap sizes reduce GC pause times, ensuring smooth end-user experiences.

Best regards,

Kostiantyn Diachenko.


Level 4

I think this issue is related to the JVM memory parameters at startup. Would people please share what they use for allocating heap, etc., when starting the JVM?

Thanks.


Administrator

@Tom_Fought Did you find the suggestions helpful? Please let us know if you require more information. Otherwise, please mark the answer as correct for posterity. If you've discovered a solution yourself, we would appreciate it if you could share it with the community. Thank you!



Kautuk Sahni


Level 4

I am still investigating this. I tried the following parameters on a 32 GB server: -Xmx28G -Xms28G -Xmn8G -XX:SurvivorRatio=4 -XX:ReservedCodeCacheSize=256m -XX:+UseParallelGC

The result was that the application ran out of memory. I will try the suggestions above, but remember that the servers are not running TarMK GC successfully; the PS Survivor Space pool never seems to have memory available.
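To see what the survivor spaces are actually doing, I'm also going to enable Java 11 unified GC logging (the flag itself is stock JDK; the log path and rotation values are just examples):

-Xlog:gc*:file=crx-quickstart/logs/gc.log:time,uptime:filecount=5,filesize=20m

That should show the survivor-space sizes around each collection and confirm whether the pool the compaction listener watches ever has headroom.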

The issue you're encountering suggests that a memory leak or inefficient memory usage in your codebase might be causing the AEM application to exhaust its memory allocation despite having ample resources. 


Verify Code for Memory Leaks
Memory leaks often arise from improperly managed objects, such as:

  • Sling Models or OSGi services with static fields or poorly scoped dependencies.
  • Large collections (e.g., HashMaps or Lists) that continue to grow.
  • Unclosed resources (e.g., InputStreams, Database Connections).
  • Improper handling of workflows or queries with large result sets.

Actions:

  1. Use tools to analyze heap dumps and identify objects retaining memory unexpectedly (see the example after this list).
  2. Check logs for out-of-memory error stack traces (java.lang.OutOfMemoryError) and analyze them for clues.
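For step 1, a heap dump can be captured with stock JDK tooling and opened in a heap analyzer such as Eclipse MAT (the PID and output path here are placeholders):

jcmd <pid> GC.heap_dump /tmp/aem-heap.hprof

Adding -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp to the start parameters will also capture a dump automatically at the moment of failure.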

If TarMK garbage collection (GC) is not running successfully, ensure enough free disk space is available; TarMK requires additional space during compaction.