SOLVED

AEM Replication agents - stuck

Level 2

Hi All,

 

With a high volume of tree activations, the replication agents get stuck or blocked, with high consumption of machine resources. This leads to a restart of the AEM instances. To avoid this, would it be useful to enable "Batch Mode" on the replication agents?

 

image (1).png

 

Or, as an alternative, upgrade the architecture with a replication tier (essentially just publish instances) between author and publish, as follows:

 

- one replication agent on author (reduces replication events on author so that author does not get stuck)

- disable asset update on the replication tier (the assets have already been processed on author)

- enable replication batch mode on the author and the replication tier to stagger the performance impact using the defined threshold level

 

Thanks a lot for any insight

 

1 Accepted Solution

Correct answer by
Employee Advisor

Hi @tommyc11112341!

First of all, the outlined behavior sounds somewhat abnormal and the root cause should be analyzed and addressed.

That being said, the two approaches to deal with the situation that you mention are:

  1. Enabling batch mode for replication
  2. Moving to a replication-tier architecture

Option 1 is a quick, low-effort approach, so I would recommend giving it a try, although I'm not convinced that it will fully resolve the issue (it will probably bring some improvement, though). It also comes with a couple of implications, e.g. pages not being replicated immediately after a content author hits the "publish" button. Depending on the replication traffic, it may take some time until the replication is actually executed. These implications need to be discussed with the business side and aligned with the requirements of your project. For more information on batch replication, please refer to the documentation of this feature.
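To make the delay trade-off concrete, here is a minimal toy model of a batching queue in Python. It is only a sketch of the general idea (collect activations and send them together once a wait time or a size threshold is reached), not AEM's actual implementation. The names `BatchingQueue`, `max_wait_s`, `trigger_size` and `replay` are hypothetical, and counting queued items as the size threshold is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class BatchingQueue:
    """Toy model of a batching queue (NOT AEM's actual implementation)."""
    max_wait_s: float   # hypothetical: flush once the oldest item is this old
    trigger_size: int   # hypothetical: flush once this many items are queued
    pending: list = field(default_factory=list)  # (enqueue_time, path) tuples
    sent: list = field(default_factory=list)     # (flush_time, [paths]) tuples

    def _flush(self, now):
        # Send everything that is pending as one batch.
        if self.pending:
            self.sent.append((now, [path for _, path in self.pending]))
            self.pending = []

    def tick(self, now):
        """Flush if the oldest pending item has waited at least max_wait_s."""
        if self.pending and now - self.pending[0][0] >= self.max_wait_s:
            self._flush(now)

    def publish(self, path, now):
        """Queue an activation; it is only sent once a threshold is reached."""
        self.tick(now)
        self.pending.append((now, path))
        if len(self.pending) >= self.trigger_size:
            self._flush(now)


def replay(events, max_wait_s, trigger_size, end_time):
    """Replay (time, path) events and return the batches that were sent."""
    queue = BatchingQueue(max_wait_s, trigger_size)
    for now, path in events:
        queue.publish(path, now)
    queue.tick(end_time)  # one final check at the end of the simulation
    return queue.sent
```

In this model a page queued on its own is not sent when the author publishes it, but only at the next check after the wait time has elapsed, which is the kind of delay that needs to be agreed with the business side. Setting `trigger_size` to 1 degenerates to immediate sending.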

Option 2 is a commonly used pattern for high replication volumes and will definitely address the issue. So if you have high replication traffic, this is a totally valid option to evaluate.

 

Hope that helps!

 

 


5 Replies

Community Advisor

Hi @tommyc11112341 

   We use batch mode on all our lower environments. If there is no business need for immediate content activation, then this is a useful feature, and it does improve replication performance.

 

Thanks

Dipti

Level 2
Hi @Dipti_Chauhan, thanks for your reply. So would it be a good idea to use batch mode with an activation tree of about 9000 nodes?

Employee Advisor

@tommyc11112341 ,

 

Enabling replication batch mode on the author and the replication tier, to stagger the performance impact using the defined threshold level, would be the best approach in the current scenario.

 

Thanks


Hi @markus_bulla_adobe, thanks a lot for your view. In fact, first of all I will investigate their current instances, because in my view there is an application issue and this behavior is only a side effect.