Replication performance issues
Hi all,
We have noticed that after our CQ5 environment has been up and running for a few days, replication performance degrades significantly.
When the system is healthy, we observe 2 MB/s of network traffic between the author and publisher instances at peak moments.
However, when performance degrades, network traffic drops to 100-200 KB/s.
Heap and PermGen memory usage on both the author and publisher instances is within the normal range, and GC runs are normal and quick. The author JVM shows less than 20% CPU usage and the publisher JVM reports 50% CPU usage.
We have also increased the size of the Sling Eventing Thread Pool to 100 on the author instances, as recommended by DayCare.
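In case it helps anyone compare numbers, here is a minimal sketch of how JVM figures like the ones above (heap, memory pools such as PermGen, GC totals, thread count) could be sampled with the standard java.lang.management API. It is illustrative only: the class name JvmSnapshot is made up, and it reports on the JVM it runs in, so it would have to run inside the instance or be adapted to connect over remote JMX.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of CQ5: summarizes the current JVM's health figures.
public class JvmSnapshot {

    private static final long MB = 1024L * 1024L;

    /** Builds a one-line summary of heap usage, memory pools, GC totals and thread count. */
    public static String snapshot() {
        StringBuilder sb = new StringBuilder();

        // Overall heap usage of the JVM this code runs in.
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        sb.append("heapUsedMB=").append(heap.getUsed() / MB);

        // Per-pool usage; pool names depend on the JVM and collector in use.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage != null) { // null if the pool is no longer valid
                sb.append(" pool[").append(pool.getName()).append("]MB=")
                  .append(usage.getUsed() / MB);
            }
        }

        // Cumulative GC count and time since JVM start; -1 means "undefined" and is skipped.
        long gcCount = 0;
        long gcTimeMs = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            gcCount += Math.max(0, gc.getCollectionCount());
            gcTimeMs += Math.max(0, gc.getCollectionTime());
        }
        sb.append(" gcCount=").append(gcCount).append(" gcTimeMs=").append(gcTimeMs);

        // Number of live threads, including daemon threads.
        sb.append(" threads=").append(ManagementFactory.getThreadMXBean().getThreadCount());
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
    }
}
```

Logging one such line every minute or so on both author and publisher should make it easier to see whether anything drifts as replication slows down.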
A few questions:
- Is there any way we can increase the number of threads that an agent uses to replicate data from an author instance to a publisher instance?
- Are there any settings we can tune on publisher instances to improve performance?
- What is the purpose of the "Batch" settings of a replication agent and how do they kick in? We have set the batch size to 10,000 and the delay to 60 seconds, but they don't seem to have any impact on performance.
Is there any other way we can mitigate this issue?
Thanks in advance.