With a high volume of tree activations, the replication agents get stuck or blocked and consume a lot of machine resources. This leads to a restart of the AEM instances. To avoid this, would it be useful to enable "Batch Mode" on the replication agents?
Or, as an alternative, upgrade the architecture with a replication tier (essentially a publish instance) between author and publish, as follows:
- one replication agent on author (reduces replication events on author so that author does not get stuck)
- disable update assets on the replication tier (the assets have already been processed on author)
- enable replication batch mode on the author and the replication tier to stagger the performance impact using the defined threshold level
Thanks a lot for any insight
First of all, the outlined behavior sounds somewhat abnormal and the root cause should be analyzed and addressed.
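That being said, the two approaches you mention for dealing with the situation are enabling batch mode on the replication agents (option 1) and adding a replication tier between author and publish (option 2).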
Option 1 is a quick, low-effort approach, so I would recommend giving it a try, although I'm not convinced that it will fully resolve the issue (it will probably bring some improvement, though). It also comes with a couple of implications, e.g. pages are not replicated immediately after a content author hits the "publish" button. Depending on the replication traffic, it may take some time until the replication is actually executed. These implications need to be discussed with the business side and aligned with the corresponding requirements of your project. For more information on batch replication, please refer to the documentation of this feature.
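To make the trade-off of option 1 a bit more concrete, here is a small Python sketch of the general batching idea: items wait in a queue and are only sent once a threshold is reached or the oldest item has waited long enough. This is a toy model, not AEM's implementation. The class `BatchQueue`, its parameters `trigger_size` and `max_wait_seconds`, and the count-based threshold are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BatchQueue:
    """Toy model of a batching queue (hypothetical, not AEM's actual code)."""
    trigger_size: int        # assumed threshold: flush once this many items wait
    max_wait_seconds: float  # assumed limit: flush once the oldest item is this old
    _items: List[str] = field(default_factory=list)
    _oldest_at: Optional[float] = None

    def add(self, path: str, now: float) -> List[str]:
        """Queue a path; return the batch that gets sent, or [] if nothing is sent yet."""
        if not self._items:
            # Remember when the first item of the current batch arrived.
            self._oldest_at = now
        self._items.append(path)
        return self._flush_if_due(now)

    def tick(self, now: float) -> List[str]:
        """Periodic check so a small batch is still sent after max_wait_seconds."""
        return self._flush_if_due(now)

    def _flush_if_due(self, now: float) -> List[str]:
        if not self._items:
            return []
        size_reached = len(self._items) >= self.trigger_size
        waited_too_long = now - self._oldest_at >= self.max_wait_seconds
        if size_reached or waited_too_long:
            batch, self._items, self._oldest_at = self._items, [], None
            return batch
        return []
```

In this model, a single page published during a quiet period is only sent once `max_wait_seconds` has passed, which is exactly the delay that needs to be agreed with the business side. During a large tree activation, the items go out in groups of `trigger_size` instead of one request per page.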
Option 2 is a commonly used pattern for high replication volumes and will definitely address the issue. So if you have high replication traffic, this is a totally valid option to evaluate.
Hope that helps!
Enabling replication batch mode on the author and the replication tier, to stagger the performance impact using the defined threshold level, would be the best approach in the current scenario.
We use batch mode on all our lower environments. If there is no business need for immediate content activation, then this is a useful feature, and it does improve replication performance.