We have an AEM 6.2 instance with Hotfix 17578 (cq-6.2.0-hotfix-17578) installed, so we are on Oak 1.4.17 and CFP18.
We run garbage collection, version cleanup, and workflow cleanup daily. We have a separate datastore and segment store. We have performed offline compaction, but that only applies to the segment store.
The disk usage report in AEM (/etc/reports/diskusage.html) shows that we are using ~18 GB of data. However, our datastore has grown to ~170 GB. I cannot figure out where the extra data is coming from or how to reduce it.
Following this page (Analyze unusual repository growth), I can see that we even have a few files between 1 and 6 GB, but there is definitely no file that big in our DAM or packages. According to the usage report, the entire DAM is ~6 GB.
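For anyone wanting to look for unusually large files directly on disk, here is a minimal sketch. It assumes GNU find is available on the server, and `largest_files` is a hypothetical helper name, not something shipped with AEM or Oak. It prints the size in bytes and the path of the biggest files under a directory.

```shell
# Hypothetical helper: list the N largest files under a directory.
# Assumes GNU find (for -printf); the default of 20 results is arbitrary.
largest_files() {
  dir="$1"
  count="${2:-20}"
  # Print "<size in bytes><TAB><path>" per file, largest first.
  find "$dir" -type f -printf '%s\t%p\n' | sort -rn | head -n "$count"
}

# Example usage (datastore path taken from this thread):
#   largest_files /content/aem/datastore 20
```

The output only shows that large binaries exist on disk; it does not say which repository nodes, if any, still reference them.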
What can I do to reduce the size of our datastore? What is causing this problem?
Here is what I have done that seems to have resolved the issue. If there is some problem I am not seeing here, please let me know; otherwise, it seems to have worked beautifully.
java -jar crx2oak-1.8.6-all-in-one.jar segment-old:/content/aem/crx-quickstart/repository segment-old:/content/backup/ --include-path=/ --src-datastore=/content/aem/datastore --datastore=/content/backup/datastore/
By running this command as if I were upgrading, but using "segment-old" for both source and target, I was able to create a repository that is ~11 GB compared to the previous ~170 GB, and everything seems to work afterwards.
The only concern is that this page (InvalidFileStoreVersionException migrating from older version to 6.3 using CRX2Oak) only mentions using "segment-old" for the source repository, but it doesn't seem to cause a problem with the destination repository.
I am not sure what the best way is to provide a report about the file differences. Here is a quick count of the files that differ between the old and new datastore:
diff -qrN datastore/ datastore-bak/ | wc -l
534635
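The diff count above only says how many files differ. A rough way to also compare the file count and total size of each datastore is sketched below. It assumes GNU find and awk are available, and `datastore_summary` is a hypothetical helper name used for illustration only.

```shell
# Hypothetical helper: print the number of regular files under a
# directory and their combined size in bytes. Assumes GNU find (-printf).
datastore_summary() {
  find "$1" -type f -printf '%s\n' |
    awk '{ n++; bytes += $1 } END { printf "%d files, %.0f bytes\n", n, bytes }'
}

# Example usage (directory names as in the diff command above):
#   datastore_summary datastore/
#   datastore_summary datastore-bak/
```

Running it on both directories gives two lines that can be compared side by side, which is easier to share in a support ticket than a full diff listing.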
The AEM Usage Report (/etc/reports/diskusage.html) is basically identical -- I do not have an exact comparison on numbers, but the difference is negligible for our purposes.
I would expect the disk usage report to remain as is, because this report just iterates through the repository and sums up the size of the properties and binaries. It does not look up files on the filesystem (and thus does not know anything about segment stores, datastores, shared datastores and such).
Just checking: Have you executed an offline compaction on your original instance before you ran the DSGC?
Jörg
Yes, I ran offline compaction multiple times and even tried removing all checkpoints at some point as recommended above.
Hm, very strange then.
Can you report this to Adobe support (if not done already), just to let them know about the situation you encountered and how you solved it?
thanks,
Jörg
Sure. I have just submitted a ticket. Thanks for everyone's help. I will update if Adobe comes back with anything insightful.