Approach for Storing a Large Number of Nodes in JCR
__96
Level 4
February 16, 2016
Solved


  • February 16, 2016
  • 13 replies
  • 6118 views

Hi,

I am working on a peculiar requirement where I need to store analytical data in the JCR as nodes and properties. We are avoiding Adobe Analytics because we have our own analytics built on an SAP-related tool, so all CQ needs to do is post the analytical data to SAP through a web service. As the requirement suggests, the amount of data we are storing will be huge: a scheduler will run every day at a set time and post the recorded data from the JCR nodes to SAP.

The data stored in the nodes relates to downloads of executables that users click: for each download we record details such as the username, user type, executable name, and download start time. The issue is that, as I understand it, a node can only have 1000 child nodes. How can I arrange the storage of these records in the JCR to overcome this 1000-child-node limitation, and where should such records live (under /etc, /content, or somewhere else)? I would also like to know whether there is any way to optimize the retrieval of values from these nodes.

Thanks,

Samir 

Best answer by joerghoh

Hi Samir,

so you collect tracking data inside of AEM and then export it regularly to SAP? In that case you are using the JCR repo as storage for quite transient data, and I don't think that this is a good idea from a conceptual point of view.

* Do you want to collect this data on publish systems? In that case you will probably store this data on each publish instance, and then your export (or SAP) needs to consolidate it. Not a real problem, but you might lose data unless you make all publish instances highly available.

* You put a lot of pressure on the repo. Write performance has improved with TarMK, but do you really want to store each data point in the repo? Please do a performance test upfront and check it against your KPIs.

* Your incoming data is not structured at all, so ordering doesn't matter. Just use an oak:unstructured node as the parent and you're fine. Just don't expect to be able to browse that folder with CRXDE Lite :-)
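If you do write into a flat oak:unstructured parent as described above, each child still needs a unique, JCR-safe name. A minimal sketch of one naming scheme (the `download-` prefix and name shape are just an illustration, not anything prescribed by Oak): a timestamp keeps the names roughly time-ordered, and a random suffix avoids collisions between concurrent writers.

```java
import java.util.UUID;

public class NodeNames {
    // Builds a collision-free, JCR-safe child node name for a flat
    // oak:unstructured parent: the epoch-millis timestamp keeps names
    // roughly sortable by creation time, and the short UUID suffix
    // prevents clashes when two downloads are recorded at the same instant.
    static String downloadNodeName(long epochMillis) {
        return "download-" + epochMillis + "-"
                + UUID.randomUUID().toString().substring(0, 8);
    }

    public static void main(String[] args) {
        // Two calls at the same millisecond still yield distinct names.
        System.out.println(downloadNodeName(System.currentTimeMillis()));
        System.out.println(downloadNodeName(System.currentTimeMillis()));
    }
}
```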

I would choose a different approach, maybe setting up a queueing service (e.g. RabbitMQ) to which each download event is submitted. Then your AEM instances are stateless again and are not loaded with storing and exporting this transient data. And you have an application which can fetch the data points from the queue and feed them directly to SAP (either live or batched, as you like).
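The producer/consumer shape Jörg describes can be sketched with a plain in-memory queue; in production this role would be played by RabbitMQ, but the pattern is the same: the AEM side submits events and returns immediately, and an exporter drains batches toward SAP. All class, field, and method names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class DownloadQueueSketch {
    // Hypothetical download event; in a real setup this would be
    // serialized (e.g. as JSON) before being published to RabbitMQ.
    record DownloadEvent(String userName, String userType,
                         String executable, long startTime) {}

    private final BlockingQueue<DownloadEvent> queue = new LinkedBlockingQueue<>();

    // Called from the AEM side: enqueue the event and return immediately,
    // so the request thread never waits on SAP.
    void submit(DownloadEvent e) {
        queue.add(e);
    }

    // Called by the exporter: drain up to batchSize events, which would
    // then be pushed to SAP in one web-service call.
    List<DownloadEvent> drainBatch(int batchSize) {
        List<DownloadEvent> batch = new ArrayList<>();
        queue.drainTo(batch, batchSize);
        return batch;
    }

    public static void main(String[] args) {
        DownloadQueueSketch q = new DownloadQueueSketch();
        q.submit(new DownloadEvent("samir", "internal", "tool.exe", 1L));
        q.submit(new DownloadEvent("mani", "external", "setup.exe", 2L));
        System.out.println(q.drainBatch(10).size()); // prints 2
    }
}
```

The key design point is the decoupling: the producer never blocks on the consumer, and a batch size on the draining side lets you trade latency against the number of web-service calls to SAP.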

Jörg

13 replies

joerghoh
Adobe Employee
February 18, 2016

Hi Samir,

don't store the data inside the JCR; write it directly into the queue instead.

Jörg

__96 (Author)
Level 4
February 19, 2016

Manikumar wrote...

Hi Samir,

If you really want to store the data under the JCR and delete it once it is sent to the SAP team, then you can use the strategy Adobe uses for storing users.

As we know, users under /home/users are stored in folders organized alphabetically. You can use the same storing strategy: create a folder structure based on some parameter, with each folder holding up to 1000 nodes.

I think this may help you :)

Thanks 

Mani Kumar K

 

Thanks, Mani. Yes, that's what I have been doing for saving product data under /etc: breaking the product name down character by character and creating nodes down to the fifth character, thereby reducing the number of immediate child nodes.
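The character-by-character bucketing described above can be sketched as a small helper. The /etc/products root and depth of 5 are just the values mentioned in this thread, and the method name is hypothetical; the idea is that each prefix level fans the children out so no single parent accumulates too many direct children.

```java
public class BucketPath {
    // Builds a bucketed JCR path such as
    //   /etc/products/p/ph/pho/phot/photo/photoshop
    // by nesting one folder per name prefix, up to `depth` levels.
    static String bucketedPath(String root, String name, int depth) {
        // Normalize to characters that are safe in a JCR node name.
        String key = name.toLowerCase().replaceAll("[^a-z0-9]", "");
        StringBuilder path = new StringBuilder(root);
        int levels = Math.min(depth, key.length());
        for (int i = 1; i <= levels; i++) {
            path.append('/').append(key, 0, i); // /p, /p/ph, /p/ph/pho, ...
        }
        return path.append('/').append(name).toString();
    }

    public static void main(String[] args) {
        System.out.println(bucketedPath("/etc/products", "photoshop", 5));
        // prints /etc/products/p/ph/pho/phot/photo/photoshop
    }
}
```

With five prefix levels over a 36-character alphabet, the fan-out per level keeps every parent's immediate child count small even for very large product catalogs.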

__96 (Author)
Level 4
February 19, 2016

Jörg Hoh wrote...

Hi Samir,

don't store the data inside the JCR; write it directly into the queue instead.

Jörg

 

Hi Jörg,

Thanks for replying. I will only fall back to storing in the JCR if I am unable to implement RabbitMQ; otherwise I will definitely try the RabbitMQ queuing approach. Do you have any document or reference for the queuing setup?

Thanks,

Samir