Bulk replication of Pages, Tags, Product Nodes

bhanuprakashdod

24-05-2020

Hi Everyone,

I have a scenario where I need to automate the replication of a huge number of pages, tags, and product nodes (around 30k overall). What is the best process to achieve this?

Thanks in advance

Accepted Solutions (1)

Theo_Pendle

MVP

24-05-2020

Hi @bhanuprakashdod,

What is your concern exactly? I can see two possibilities:

  1. You are concerned about performance. A huge replication could take several hours and you don't want your site to be slow during that time. In this case the simple answer is to perform the replication in a lower environment first to determine how long it takes (is it 30 minutes or 6 hours?), then identify the slot in the day when you have the least traffic and do it then. If you have several publishers behind a load balancer, you can do it even more seamlessly by replicating to one publisher at a time, routing traffic to the other publisher instances during that period.

  2. The content is not hierarchically related, i.e. you want to publish /content/site/pageA and /content/site/pageB/1 but not /content/site/pageA/1 or something like that, so it's not as simple as publishing the root of a single website. Doing the publication via the Touch UI might require hundreds or thousands of clicks, with a large risk of human error. In this case you will need to write some backend logic to replicate your content programmatically. You can do this by using the com.day.cq.replication.Replicator service to trigger resource replication. See more information on how to use the API here.
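A minimal sketch of the second case, in plain Java so it can run outside AEM. The `PathActivator` interface and the class name are hypothetical stand-ins; inside a real bundle the lambda would presumably wrap a call such as `replicator.replicate(session, ReplicationActionType.ACTIVATE, path)` on the injected `Replicator` service. The point is only to show an explicit path list being walked, with failures collected instead of aborting the whole run.

```java
import java.util.ArrayList;
import java.util.List;

class BulkReplicationSketch {

    /**
     * Hypothetical stand-in for the AEM call. In a real bundle this would
     * delegate to the Replicator service; here it is just a callback.
     */
    interface PathActivator {
        void activate(String path) throws Exception;
    }

    /** Activates each path in turn and returns the paths that failed. */
    static List<String> activateAll(List<String> paths, PathActivator activator) {
        List<String> failed = new ArrayList<>();
        for (String path : paths) {
            try {
                activator.activate(path);
            } catch (Exception e) {
                // Remember the failure and keep going with the remaining paths.
                failed.add(path);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Explicit, non-hierarchical selection of content, as in the example above.
        List<String> paths = List.of("/content/site/pageA", "/content/site/pageB/1");
        List<String> failed = activateAll(paths, p -> System.out.println("activate " + p));
        System.out.println("failed: " + failed);
    }
}
```

Returning the failed paths makes it easy to re-run only those, which matters when the full list is around 30k entries.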

Answers (3)

raghavc

26-05-2020

Adding to the suggestions shared by others: if you are unable to perform the operation outside business hours, consider creating a new replication agent and using it to replicate the content from your custom code. This ensures normal content activity on the default agent is not impacted. You could also reduce the load by replicating the content in batches.
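The batching idea could look roughly like the sketch below. It is plain Java with a hypothetical class name and only shows how a large path list is split into fixed-size batches; the batch size is an assumption to tune against your own environment. In AEM, targeting the dedicated agent would likely be done by passing a `ReplicationOptions` with an `AgentFilter` that matches the agent's id, which is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

class BatchSketch {

    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            int end = Math.min(i + batchSize, items.size());
            // Copy the sublist so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> paths = List.of("/content/a", "/content/b", "/content/c", "/content/d", "/content/e");
        for (List<String> batch : partition(paths, 2)) {
            // Replicate one batch here, then wait for the agent queue to drain
            // before starting the next one.
            System.out.println("batch: " + batch);
        }
    }
}
```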

Veena_Vikram

MVP

25-05-2020

@bhanuprakashdod I agree with @Theo_Pendle: the only issue you might have here is performance, and he has suggested the best way to figure that out.

Adding to that, in my experience the best way to do it is to write a custom back-end process, maybe a Job that runs in a particular time window (so that you don't disturb business hours). If not, you can write a servlet and hit the servlet to run the process. I would say a Job is the best approach in this case, as it ensures that the work is completed: if it fails, it is retried for the number of times you have configured. It also lets you know whether the Job completed or failed (if it still fails after the retries), so you can be sure whether all the content was replicated successfully or not. I have tried to explain the benefits of Jobs here 😊 Just see if this helps. ✌
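To picture the retry behaviour described above, here is a plain-Java sketch. It is not the Sling Jobs API: the `Result` enum and `runWithRetries` are hypothetical names. In AEM the equivalent would be a `JobConsumer` returning `JobResult.OK` or `JobResult.FAILED`, with the number of retries configured on the job queue rather than in your own loop.

```java
import java.util.function.Supplier;

class RetrySketch {

    /** Simplified outcome, loosely mirroring a job's OK / FAILED result. */
    enum Result { OK, FAILED }

    /** Runs the task until it reports OK or the attempts are used up. */
    static Result runWithRetries(Supplier<Result> task, int maxAttempts) {
        Result last = Result.FAILED;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            last = task.get();
            if (last == Result.OK) {
                break; // done, no further retries needed
            }
        }
        return last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated replication run that fails twice before succeeding.
        Result result = runWithRetries(
                () -> ++calls[0] < 3 ? Result.FAILED : Result.OK, 5);
        System.out.println(result + " after " + calls[0] + " attempt(s)");
    }
}
```

The final result is what tells you whether the whole replication run can be trusted or needs a manual follow-up.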

Ankur_Khare

MVP

24-05-2020