Is it recommended to store huge datasets in AEM ? | Community
akashkriz005
Level 2
January 14, 2024
Solved

Is it recommended to store huge datasets in AEM ?


Hi Team,

Is it recommended to store a large non-web-content dataset (around 20,000 records) in AEM, or is it better to keep it in an external database and store and retrieve it through RESTful service calls?

Can you please provide the pros and cons of both approaches?


Regards, 
Akash


2 replies

Harwinder-singh
Community Advisor
January 14, 2024

@akashkriz005

The immediate issue with storing anything like this is going to be data organization.

Based on Adobe's recommendations, if you have more than 1,000 immediate child nodes under a single parent node, you will start to see performance issues when working with that data. You will have to organize these 20,000 records so that you don't end up in that situation.
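
One way to stay under that threshold is to shard the records into intermediate bucket folders derived from a hash of the record ID. The sketch below only illustrates the path scheme: the root path `/content/records`, the function name `bucket_path`, the `rec-<n>` IDs, and the 256-way layout are all assumptions made for this example, not an Adobe-prescribed structure. It is written in Python for brevity; in AEM the same logic would sit in the Java code that creates the nodes, and it assumes the record IDs are valid node names.

```python
import hashlib
from collections import Counter


def bucket_path(record_id: str, root: str = "/content/records") -> str:
    """Return a path that places the record in one of 256 bucket folders.

    The bucket name is the first two hex characters of the SHA-1 digest of
    the record ID, so the same ID always maps to the same bucket.
    """
    digest = hashlib.sha1(record_id.encode("utf-8")).hexdigest()
    bucket = digest[:2]  # "00" .. "ff" gives at most 256 buckets under root
    return f"{root}/{bucket}/{record_id}"


# Distribute 20,000 hypothetical record IDs and find the fullest bucket.
paths = [bucket_path(f"rec-{i}") for i in range(20000)]
per_bucket = Counter(p.rsplit("/", 2)[1] for p in paths)
largest_bucket = max(per_bucket.values())
```

With 20,000 records spread over at most 256 buckets, each bucket holds about 78 records on average, so both the root folder and every bucket stay well below 1,000 direct children.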

The other issue is scalability. AEM's repository may not scale as efficiently as a dedicated database for handling large volumes of data.

The upside of storing this data in AEM is that you get access to AEM features like versioning, workflows, and permissions, which can be leveraged for this data. Also, in this case you don't need to build any REST services and integrate them with AEM. Since everything is in AEM, you save the network calls that you would otherwise make when using REST services.

Hope this helps.

 

 

 

akashkriz005
Level 2
January 14, 2024

Thanks for the info, @harwinder-singh.

Do we have any supporting documentation from Adobe covering this scenario, i.e. stating that having more than 1,000 immediate child nodes under a single parent node causes scalability and other issues?

Cheers !

aanchal-sikka
Community Advisor
Accepted solution
January 15, 2024

@akashkriz005 

Sharing a few related excerpts:

Oak scales to large number of direct child nodes of a node as long as those are not orderable. For orderable child nodes Oak keeps the order in an internal property, which will lead to a performance degradation when the list grows too large. For such scenarios Oak provides the oak:Unstructured node type, which is equivalent to nt:unstructured except that it is not orderable.

 

Reference: https://jackrabbit.apache.org/oak/docs/dos_and_donts.html
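
To see why an order list on the parent hurts as it grows, here is a deliberately simplified cost model. It is not Oak's implementation: it assumes that every add to an orderable parent rewrites the whole order list, and it only counts list entries written. The function name `order_entries_written` is made up for this illustration.

```python
def order_entries_written(n_children: int, orderable: bool) -> int:
    """Toy model: count entries written while adding n children one by one."""
    if not orderable:
        # No order list to maintain, so each add writes only the new child.
        return n_children
    # With an order list on the parent, the k-th add rewrites k entries.
    return sum(range(1, n_children + 1))


ordered_cost = order_entries_written(20000, orderable=True)
unordered_cost = order_entries_written(20000, orderable=False)
```

Under this model, adding 20,000 children to one orderable parent writes 200,010,000 list entries in total, against 20,000 writes for a non-orderable parent. That quadratic growth is the kind of degradation the excerpt describes.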

 

  • Reading nodes is not affected by the number of child nodes. But if the child nodes are orderable, the time to add or remove nodes degrades as the list grows. Also, browsing a large number of child nodes in the UI is slow, because the browser and JavaScript have to render them all.

Ref: https://cqdump.joerghoh.de/2015/07/09/1000-nodes-per-folder-and-oak-orderable-nodes/ 

 

 

 

Aanchal Sikka
kautuk_sahni
Community Manager
January 15, 2024

Worth Checking this:

Kautuk Sahni