This article gives a big-picture overview of the options available for storing data in the AEM repository, and of how to keep that storage optimized by running the corresponding periodic cleanup mechanisms.
When I tried to understand this process, I came across various terms (node store, data store, compaction, cleanup) but was often confused about which one applies where. For readers who want to study each of these in detail, I have attached the corresponding Adobe links at the end. Here I have tried to cover the whole picture: the storage options in AEM and the cleanup mechanisms that go with each of them.
Starting with AEM 6, the AEM platform is based on an Apache Jackrabbit Oak repository (replacing the Jackrabbit 2.x repository of previous versions). This repository can be split into two storage elements: the Node Store and the Data Store (also called the Blob Store).
In Adobe Experience Manager (AEM), binary data can be stored independently from the content nodes. The binary data is stored in a data store, whereas content nodes are stored in a node store.
Node Store Vs Data Store
The node store holds the metadata and references for everything in the repository, whereas the data store holds every item larger than a predefined size (this size is configurable; the default is 4 KB). Anything larger than this threshold is stored in the data store and not in the node store. In practice, the data store usually contains images, assets, and other binary data.
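To make the size rule concrete, here is a small toy model in Python. It is only a sketch of the idea described above, not how Oak is implemented: the class name ToyRepository, the dictionary-based stores, the "blobRef" property, and the SHA-256 blob id are all assumptions made for illustration. The 4 KB threshold is the default mentioned above; as far as I know, Oak's file data store exposes it as the minRecordLength setting, but please verify that against the Adobe documentation for your version.

```python
import hashlib
from dataclasses import dataclass, field

# 4 KB threshold taken from the article; configurable in a real setup.
MIN_RECORD_LENGTH = 4 * 1024


@dataclass
class ToyRepository:
    """Toy illustration only; names and layout are hypothetical, not Oak's."""
    node_store: dict = field(default_factory=dict)  # path -> properties
    data_store: dict = field(default_factory=dict)  # blob id -> raw bytes
    min_record_length: int = MIN_RECORD_LENGTH

    def save_binary(self, path: str, payload: bytes) -> str:
        """Store a payload and report which store received the bytes."""
        if len(payload) > self.min_record_length:
            # Large payload: bytes go to the data store, and the node
            # store keeps only a reference plus some metadata.
            blob_id = hashlib.sha256(payload).hexdigest()
            self.data_store[blob_id] = payload
            self.node_store[path] = {"blobRef": blob_id, "size": len(payload)}
            return "data store"
        # Small payload: kept inline next to the node's other properties.
        self.node_store[path] = {"inline": payload, "size": len(payload)}
        return "node store"


repo = ToyRepository()
print(repo.save_binary("/content/site/title", b"Hello"))        # node store
print(repo.save_binary("/content/dam/hero.jpg", bytes(10_000)))  # data store
```

Notice that even for the large payload the node store still gets an entry: it keeps the reference to the blob. This split is why the two stores need separate cleanup mechanisms, since removing a node does not by itself remove the bytes it pointed to.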