SOLVED

compaction - which nodes are being deleted

Level 3

We have problems with a repository growing out of control. Luckily, compaction helps a lot (we're using online compaction), but I'd like to determine what is being deleted (compacted).

I'm using:

java -Dlogback.configurationFile=/tmp/logback.xml -jar ~/Downloads/oak-run-1.4.9.jar compact repository/segmentstore/ >>/tmp/oak-run.log

with logback logging at trace level to try to determine what is being deleted. Although the output comes pretty close, it doesn't tell me the paths of the nodes that are being deleted.

Reference: https://docs.adobe.com/docs/en/aem/6-2/deploy/platform/storage-elements-in-aem-6.html#Performing%20O...

Any suggestions on how to tackle this? My next step is to enable repository write logging to determine the paths of the nodes being written.

Thx,

Federico

6 Replies

Administrator

Hi

Please read these community articles, which cover most aspects of compaction.

Link:- http://www.aemcq5tutorials.com/tutorials/online-offline-tar-compaction-in-aem/

There are two ways to run tar compaction in AEM: online tar compaction and offline tar compaction. The tutorial covers the following topics:

  • Steps to Run Online Compaction in AEM
  • Steps to Run Offline Compaction in AEM
  • Increase Performance of offline tar compaction
  • Frequently Asked Questions

 

Link:- http://adobeaemclub.com/oak-tarmk-compaction/

// TarMK Compaction

If tar files are used as the storage, they tend to grow and claim more disk space every time data is created or updated, because data in tar files is never overwritten; new versions are appended instead. To mitigate this, AEM has a garbage collection mechanism known as "tar compaction", which removes unused data and reclaims the disk space.

Link:- https://www.netcentric.biz/blog/is-your-repository-growing-rapidly-in-aem6.html

// Growth of repositories in AEM? Here is the solution.

I hope this helps.

~kautuk



Kautuk Sahni

Employee

Hi Federico, 

Have you informed Daycare and obtained their support for online compaction? That is the only way it is supported.

The increase in size of the repository is in direct relation to the activity on the instance, so if you are doing a lot of writes, changes, moves, re-indexing, replication, then this will generate the growth in your repository. What kind of activity is happening on your authoring instance? That will help you to determine where the growth is coming from.

Regards,

Opkar

Level 3

Hi guys, thanks for the answers.

I've seen some of these articles before; however, the main problem is that the repository is growing really fast (about 1 GB per hour) without any activity that could explain it.
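A growth rate like this can be measured rather than estimated by sampling the size of the segment store directory at two points in time. The following is a minimal sketch; the helper names `dir_size_bytes` and `growth_gb_per_hour` are made up for illustration, and the directory to sample is assumed to be the `repository/segmentstore` folder of the instance.

```python
import os


def dir_size_bytes(path):
    """Return the total size in bytes of all files below `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # A file can disappear between listing and stat; skip it.
                pass
    return total


def growth_gb_per_hour(size_before, size_after, seconds):
    """Average growth in GB per hour between two size samples."""
    return (size_after - size_before) / 1e9 * 3600 / seconds
```

Calling `dir_size_bytes("repository/segmentstore")` twice, a known number of seconds apart, and passing both results to `growth_gb_per_hour` gives a number that can be compared before and after a suspected writer is switched off.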

I've tried an offline compaction. It was relatively fast (3 minutes) and the size reduction was quite good; however, after restarting AEM the repository started to grow again. Hence I really need to investigate the root cause of the problem.

The latest suspect is the integration with Adobe Target. After enabling low-level logging of the Oak repository, I saw many entries like this in logs/audit.log: "23.11.2016 06:37:37.146 [admin] [session-3003] Setting property [/etc/segmentation/adobe-target/XXX/498967/jcr:content/jcr:title]".
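With many entries of this kind, grouping them by path prefix shows quickly which subtree receives the most writes. The sketch below assumes the log lines follow the "Setting property [<path>]" format quoted above; other write operations may be logged with different wording and are ignored here. The function names are hypothetical, not part of Oak or AEM.

```python
import re
from collections import Counter

# Matches the "Setting property [<path>]" entries quoted in the post.
WRITE_RE = re.compile(r"Setting property \[(/[^\]]*)\]")


def count_writes_by_prefix(lines, depth=3):
    """Count logged property writes, grouped by the first `depth` path segments."""
    counts = Counter()
    for line in lines:
        match = WRITE_RE.search(line)
        if not match:
            continue  # not a property-write entry
        segments = match.group(1).strip("/").split("/")
        counts["/" + "/".join(segments[:depth])] += 1
    return counts


def print_top_prefixes(log_path, depth=3, top=20):
    """Print the most frequently written path prefixes found in a log file."""
    with open(log_path, encoding="utf-8", errors="replace") as log:
        counts = count_writes_by_prefix(log, depth)
    for prefix, n in counts.most_common(top):
        print(f"{n:8d}  {prefix}")
```

For example, `print_top_prefixes("logs/audit.log")` would list the busiest three-segment prefixes first; the entry quoted above would be counted under `/etc/segmentation/adobe-target`.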

I'll keep you posted.

Btw: Adobe Experience Manager (6.1.0.SP2), Apache Jackrabbit Oak 1.2.16

Best,

Federico

Employee

How are you doing the compaction? You are not using the "rm-all" flag every time, are you?

Regards,

Opkar

Employee Advisor

Hi,

If your repo is growing without any observable activity, you can use the mechanism described in [1] to log all write activity. That should help.

Jörg

[1] https://cqdump.wordpress.com/2016/05/24/what-is-writing-to-my-oak-repository/

Correct answer by
Level 3

Opkar, 

Nope, I wasn't using the rm-all flag.

 

Jörg,

Thanks a lot for the article. I was doing exactly that. The tip about generating a stack trace when a new session is created is quite neat!

In the end, the problem was a process, invoked hourly, that rebuilt some content packages and downloaded them from AEM. Each rebuild increases the size of the repository by the size of the package (which is over 1 GB). I suppose this is because the repository storage is append-only: new data is written alongside the old, which stays on disk until compaction removes it.

Generating these packages less often, combined with online compaction, should take care of this.

Thanks for your help!

Federico