SOLVED

Keep an object in CQ5 session for all users

Level 3

I need to read and fetch values from an XML file kept in the DAM. The file is 111 MB in size; it takes a very long time to unmarshal, and then an equally long time to find the value I am looking for.

I am using JAXB to unmarshal the XML file and then iterating through the resulting list. Most of the time, though, I hit timeouts and heap space errors while parsing it.

 

Is there any way to keep the unmarshalled object in the session and fetch it from there whenever a user needs to search for some data in it?

Any suggestions will be appreciated.

5 Replies

Level 8

If Scott's suggestion to use the JCR doesn't work for you, solving this problem is really more of a Java issue than an AEM one. If you create an OSGi service to handle searching the data, you could unmarshal the file on system startup, store the unmarshalled object in an instance field of the service, and then have any code that needs the XML data call the OSGi service. In addition, you might consider integrating Redis or Memcached (or something similar) to speed up searches of the unmarshalled object.
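
As a rough illustration of the "parse once, keep it in a field" idea, here is a minimal sketch in plain Java so it can run outside AEM. Everything in it is an assumption for illustration: the class name `ItemLookup`, the `<item id="...">` layout (the thread never shows the real XML structure), and the use of the JDK's streaming StAX reader in place of JAXB, chosen so the sketch is self-contained and does not build the full object tree in memory. In AEM the class would be registered as an OSGi service and `load` would be called from its activation method with a stream read from the DAM asset.

```java
import java.io.InputStream;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical holder for the parsed data; shared by all callers. */
final class ItemLookup {

    // Replaced as a whole after each successful load, so readers never see a half-built map.
    private volatile Map<String, String> valuesById = Collections.emptyMap();

    /** Streams the XML once and indexes the text of each <item id="..."> element by its id. */
    void load(InputStream xml) {
        Map<String, String> fresh = new HashMap<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "item".equals(reader.getLocalName())) {
                    String id = reader.getAttributeValue(null, "id");
                    // getElementText() consumes a text-only element up to its end tag.
                    String text = reader.getElementText();
                    if (id != null) {
                        fresh.put(id, text);
                    }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            // The previously loaded map stays in place if parsing fails.
            throw new IllegalStateException("Could not parse item XML", e);
        }
        valuesById = Collections.unmodifiableMap(fresh);
    }

    /** Returns the value for the given id, or null if there is no such item. */
    String find(String id) {
        return valuesById.get(id);
    }
}
```

Callers would then use something like `lookup.find("a1")` per request; the map lookup avoids both re-parsing the file and iterating the whole list for every search.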

Level 10
You can also write an OSGi service that reads the XML values and stores them in the JCR. When you need a value, retrieve it from the JCR. That way, you are not parsing the XML every time you need a value.

Level 3

I thought of doing that, so that we can run a JCR query every time we need to fetch some data.

But the file will have about 180,000 - 250,000 items (each item having a specific set of fields). Don't you think creating such a huge number of nodes will put extra load on the JCR and could cause indexing issues in AEM 6?
Just a hunch. I am not 100% sure of the impact of this many nodes being created/recreated daily. (The XML changes daily.)

Level 10
With this much data on a daily basis, you may want to consider writing an AEM Sling scheduler service that runs once a day and updates the nodes. This way, the data gets updated automatically. For information about a scheduler service, see https://helpx.adobe.com/experience-manager/using/aem-first-components1.html
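
To show the shape of a once-a-day refresh job, here is a small sketch that uses the JDK's `ScheduledExecutorService` as a stand-in. It is not the Sling scheduler API; inside AEM the Sling scheduler would trigger the same kind of `Runnable`. The class name `DailyRefresh` and the idea of passing the import step in as a `Runnable` are assumptions made for illustration.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a scheduled AEM job: runs a refresh task now, then every 24 hours. */
final class DailyRefresh implements AutoCloseable {

    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();

    /** Schedules the task with no initial delay and a fixed 24-hour period. */
    void start(Runnable refreshTask) {
        executor.scheduleAtFixedRate(() -> {
            try {
                refreshTask.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs, so log and carry on.
                System.err.println("Refresh failed: " + e);
            }
        }, 0, 24, TimeUnit.HOURS);
    }

    /** Stops the background thread. */
    @Override
    public void close() {
        executor.shutdownNow();
    }
}
```

The task passed to `start` would be whatever re-reads the XML and updates the stored data, for example `job.start(() -> importDailyFile())` with a hypothetical `importDailyFile` method.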

Correct answer by
Level 10
After talking with some coworkers: that volume of data, refreshed on a daily basis, should be stored outside of the JCR. For example, place the data in MySQL.