Replicate 200K nodes to publish and preview without blocking SCD queue in AEMaaCS | Community
Level 2
May 9, 2024
Solved

  • May 9, 2024
  • 4 replies
  • 1927 views

Hi Team,

 

I'm looking for a way to replicate a large number of nodes from the author to the publish server without impacting performance. In my AEM author instance there is a folder under /etc/com that contains approximately 200K nodes. A business requirement means these nodes are also needed on the publish and preview servers. When we tried to replicate this folder in a non-prod environment, it completely blocked the distribution queue (SCD) for hours. During that time, any page replication took more than a minute; in a few cases it took more than 15 minutes.

 

So I'm looking for a way to replicate this huge number of nodes to the publish and preview servers without impacting server performance or blocking the SCD queue.

 

Any suggestions would be appreciated! Thanks

 


anupampat
Community Advisor
May 9, 2024

Hi @prithwi ,

 

You can make use of Managed Controlled Processes (MCP); you will need to write the code and process the nodes in batches. This ensures that you retain performance and do not put excessive load on the system. You can explore the APIs. More info: https://kiransg.com/tag/mcp/

Thanks.
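
The batching idea suggested in the reply above can be sketched in plain Java. This is a minimal illustration, not MCP's API: `PathBatcher`, the batch size, the pause, and the `Consumer` callback are all hypothetical names, and the callback is only a stand-in for whatever replication call the real process would make.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: processes repository paths in fixed-size batches. */
class PathBatcher {

    /** Returns consecutive sublists of at most batchSize paths each. */
    static List<List<String>> partition(List<String> paths, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < paths.size(); start += batchSize) {
            int end = Math.min(start + batchSize, paths.size());
            batches.add(new ArrayList<>(paths.subList(start, end)));
        }
        return batches;
    }

    /**
     * Hands each batch to the given action and pauses between batches so that
     * other work (for example regular page activations) can get through.
     * The action is a stand-in for the real replication call (assumption).
     * Returns the number of batches that were handed to the action.
     */
    static int processInBatches(List<String> paths, int batchSize, long pauseMillis,
                                Consumer<List<String>> action) {
        List<List<String>> batches = partition(paths, batchSize);
        int processed = 0;
        for (List<String> batch : batches) {
            action.accept(batch);
            processed++;
            if (processed < batches.size() && pauseMillis > 0) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop early.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return processed;
    }
}
```

The batch size and pause would need tuning against the actual queue behaviour; the sketch only shows the control flow, not how long a real queue needs to drain.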

prithwi (Author)
Level 2
May 10, 2024

Thanks @anupampat for your solution. This is a one-time task, so MCP will not be useful once the task is done. I'm looking for something like a tool that makes this process easy, or a way to run this replication at low priority.

 

I thought of using package replication, but due to the large size, package creation itself failed. 😞

joerghoh
Adobe Employee
May 10, 2024

Have you tried to create a content package of that path, and then replicate the package? Installing that package will probably take a while, so I am not sure if that is acceptable.

 

 

prithwi (Author)
Level 2
May 12, 2024

Hi @joerghoh,

Yes, I tried to create a content package, but it was stuck for an hour and finally failed. So my idea of package replication did not work. That's why I'm looking for an alternative solution.

joerghoh
Adobe Employee
May 13, 2024

I assume that it can take a lot of time, but it should not have failed. How did you determine that it failed?

kautuk_sahni
Community Manager
May 17, 2024

@prithwi Did you find the suggestions from users helpful? Please let us know if more information is required. Otherwise, please mark the answer as correct for posterity. If you have found a solution yourself, please share it with the community.

Kautuk Sahni
prithwi (Author)
Level 2
May 17, 2024

Hi @kautuk_sahni, no definite solution yet. However, I'm trying to import the data into the publishers directly. I'll respond today if this works.

prithwi (Author, accepted solution)
Level 2
June 20, 2024

Hi, 

Responding to this thread to close it with the solution we opted for. We tried many things, but in the end we changed the data storage structure in JCR: instead of storing all the nodes in a single folder, we divided them into multiple subfolders. This restructuring let us select one folder at a time and replicate the folders one by one. We wrote a small program to perform this replication task.

To group the nodes into subfolders we used simple logic. Every node is identified by a unique number, so we divided that number by 10000, created a subfolder named after the quotient, and placed the node under it. This logic is not very efficient and does not group the nodes equally, but it solved our problem.

 

Thanks to all community members.  
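
The grouping logic described in the accepted solution can be sketched as follows. This is an illustration under assumptions rather than the author's actual program: the class `NodeBuckets`, the `parentPath` argument, and the idea of naming each node after its number are hypothetical; only the divisor of 10000 and the use of the quotient as the subfolder name come from the post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Sketch of the quotient-based grouping described in the post. */
class NodeBuckets {

    /** Divisor taken from the post. */
    static final long DIVISOR = 10_000L;

    /** Subfolder for a node: the integer quotient of its number and the divisor. */
    static long bucketFor(long nodeNumber) {
        if (nodeNumber < 0) {
            throw new IllegalArgumentException("nodeNumber must be non-negative");
        }
        return nodeNumber / DIVISOR;
    }

    /** Path of a node under its subfolder; parentPath and node naming are assumptions. */
    static String pathFor(String parentPath, long nodeNumber) {
        return parentPath + "/" + bucketFor(nodeNumber) + "/" + nodeNumber;
    }

    /** Groups node numbers by subfolder so each folder can be handled in turn. */
    static Map<Long, List<Long>> group(List<Long> nodeNumbers) {
        Map<Long, List<Long>> byFolder = new TreeMap<>(); // sorted by subfolder number
        for (long n : nodeNumbers) {
            byFolder.computeIfAbsent(bucketFor(n), k -> new ArrayList<>()).add(n);
        }
        return byFolder;
    }
}
```

Because each quotient covers 10000 consecutive numbers, no subfolder can hold more than 10000 nodes, but gaps in the numbering leave some subfolders much smaller than others, which matches the uneven grouping mentioned in the post. Iterating over the map returned by `group` yields one subfolder at a time to replicate.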

joerghoh
Adobe Employee
June 20, 2024

I assume that the bottleneck in the initial case was the child node list, which needs to be maintained because many node types support only ordered child lists. If the ordering doesn't matter at all, a node type like "oak:Unstructured" would have helped, because there the updates of the child node list would no longer be the bottleneck (there would be no such list to maintain).

 

But thanks for posting the approach you have chosen.