One of our customers has migrated a lot of content (assets and pages): they have about 1.5 million assets and about 4 million site pages. Some of the sites are generated by the XML Add-on.
With this much content, Author (AEM 6.4) is obviously slow for authoring, and it sometimes crashes.
What can usually be done with a repository of this size?
Thanks for chiming in, Jorg.
The answer to most of your questions is "XML Add-on".
So this process (along with some workflows that are run by the XML Add-on) is creating a lot of nodes/content on Author and bloating it. Maybe it's more the process that needs to be updated than any issue in AEM.
I wonder what value comes with lots of pages that are auto-generated and probably also auto-maintained in AEM. From a handling point of view, does it make sense to have them all in AEM? How are your authors interacting with them? Are they at all? Or are only automatic processes working with them all the time (regenerating, publishing, deactivating)?
Same with assets.
If you cannot answer these questions, I would try to find a way to maintain all this content outside of AEM. If you manage this content only by automation, what's the value of having them in AEM?
I think your AEM instance might not be tuned properly. I have written a short summary of the "Sling Memory Deep Dive" session here: https://medium.com/the-telegraph-engineering/four-highlights-from-adaptto-2018-3781782a6b7a, and a detailed video of the session is available from adaptTo() 2018. I found it very interesting in terms of memory usage, so you may want to check it out.
Running compaction will help. AEM 6.4 already does online compaction (online revision cleanup), so make sure it is actually running and completing successfully.
Creating indexes based on your queries will help too.
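As a rough sketch of what a query-specific index could look like, the snippet below builds an Oak Lucene property index definition as JSON. The node type `cq:PageContent` and the property `productId` are made-up placeholders for whatever your slow queries actually filter on, and the helper function is purely illustrative. The resulting structure would still need to be created under /oak:index (for example via CRXDE or a content package) and checked against the Oak documentation for your version.

```python
import json


def lucene_property_index(node_type, properties):
    """Build an Oak Lucene property index definition as a plain dict.

    node_type and properties are placeholders chosen by the caller;
    they should match the node type and properties used in the query.
    """
    property_nodes = {
        # The child node name is arbitrary; ':' is replaced to keep it simple.
        name.replace(":", "_"): {
            "jcr:primaryType": "nt:unstructured",
            "name": name,
            "propertyIndex": True,
        }
        for name in properties
    }
    return {
        "jcr:primaryType": "oak:QueryIndexDefinition",
        "type": "lucene",
        "async": "async",
        "compatVersion": 2,
        "indexRules": {
            "jcr:primaryType": "nt:unstructured",
            node_type: {
                "jcr:primaryType": "nt:unstructured",
                "properties": {
                    "jcr:primaryType": "nt:unstructured",
                    **property_nodes,
                },
            },
        },
    }


# Hypothetical example: queries that filter cq:PageContent nodes by "productId".
definition = lucene_property_index("cq:PageContent", ["productId"])
print(json.dumps(definition, indent=2))
```

Keeping each index limited to one node type and only the properties a query really uses keeps the index small, which matters on a repository of this size.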
That really is a huge amount of data.
I have a few things in mind that could be helpful.