AEM 6.4.8: publishing en-us content resulted in content being deleted from a geo site by msm-service

akashdeepAEM
Level 2

09-07-2021

Hi,

cc @kautuk_sahni @Jörg_Hoh  @berliant  @jbrar @Arun_Patidar 

We noticed that replicating content at /content/proj/en-us/prod/accessories resulted in /content/proj/ja-jp/prod/accessories being deleted by msm-service on one of our many geo sites.

This section of content (/prod/accessories) was deleted from only one of the geo sites (ja-jp).
That geo site (ja-jp) had pages that do not exist on en-us (the master) and are specific to this geo.
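To make the "pages that exist only on the geo site" condition concrete, here is a minimal sketch that compares two lists of page paths and reports those present under the live copy but missing under the master. It assumes the path lists were exported by some other means (for example, from a repository query); the class name, method name, and the child page names in `main` are hypothetical and not taken from this thread.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class LiveCopyOnlyPages {

    /** Returns relative paths that exist under copyRoot but have no counterpart under masterRoot. */
    public static Set<String> liveCopyOnly(String masterRoot, List<String> masterPaths,
                                           String copyRoot, List<String> copyPaths) {
        // Collect master paths relative to the master root
        Set<String> masterRelative = new TreeSet<>();
        for (String path : masterPaths) {
            if (path.startsWith(masterRoot + "/")) {
                masterRelative.add(path.substring(masterRoot.length()));
            }
        }
        // Keep live copy paths whose relative path is unknown to the master
        Set<String> result = new TreeSet<>();
        for (String path : copyPaths) {
            if (!path.startsWith(copyRoot + "/")) {
                continue; // ignore anything outside the live copy tree
            }
            String relative = path.substring(copyRoot.length());
            if (!masterRelative.contains(relative)) {
                result.add(relative);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical child pages, for illustration only
        List<String> master = List.of(
            "/content/proj/en-us/prod/accessories",
            "/content/proj/en-us/prod/accessories/cables");
        List<String> copy = List.of(
            "/content/proj/ja-jp/prod/accessories",
            "/content/proj/ja-jp/prod/accessories/cables",
            "/content/proj/ja-jp/prod/accessories/local-only-item");
        for (String rel : liveCopyOnly("/content/proj/en-us", master, "/content/proj/ja-jp", copy)) {
            System.out.println(rel);
        }
    }
}
```

A list like this could help narrow down which geo-specific pages to watch when reproducing the rollout on a test instance.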

We suspect that the Rollout Manager tried to resolve a conflict, which resulted in this content being deleted,
because a com.day.cq.wcm.api.WCMException: javax.jcr.InvalidItemStateException was logged around that time.



... 1 line omitted ...
com.day.cq.wcm.api.WCMException: javax.jcr.InvalidItemStateException: OakState0001: Unresolved conflicts in /content/proj/en-in/products/accessories/jcr:content
    at com.day.cq.wcm.msm.impl.RolloutManagerImpl.save(RolloutManagerImpl.java:944) [com.day.cq.wcm.cq-msm-core:5.11.84]
    at com.day.cq.wcm.msm.impl.RolloutManagerImpl.save(RolloutManagerImpl.java:928) [com.day.cq.wcm.cq-msm-core:5.11.84]
    at com.day.cq.wcm.msm.impl.RolloutManagerImpl.rolloutPageRelations(RolloutManagerImpl.java:627) [com.day.cq.wcm.cq-msm-core:5.11.84]
    at com.day.cq.wcm.msm.impl.RolloutManagerImpl.rolloutRelations(RolloutManagerImpl.java:869) [com.day.cq.wcm.cq-msm-core:5.11.84]


Is there a recommended way to avoid this deletion?
What is the recommended configuration for the Day CQ WCM Rollout Manager in a use case like this?


Do you think we should increase the background thread pool size from the current 5 to the number of geos (25),
and increase the maximum shutdown time from the current 10 minutes to a higher value such as 30 minutes?

akashdeepAEM_0-1625848594192.png

Thank You!

 

Accepted Solutions (1)

Jörg_Hoh
Employee

18-07-2021

I don't think the exception has anything to do with the problem you describe. Instead, it could be that this exception terminated a larger operation, which would otherwise have deleted much more content.

 

What is the relation between /content/proj/en-us and /content/proj/ja-jp? Are they both live copies of a /content/project/master tree? Or is one path the live copy of the other?

Which rollout configurations were used in the rollout?

akashdeepAEM

/content/proj/en-us is the cq:master for /content/proj/ja-jp.
One path is the live copy of the other.

The MSM rollout triggers include publish, modification, and rollout, which invoke the following live actions:
OOTB live actions: ContentUpdateAction, ContentDeleteAction
Custom live actions: updateModifiedContentToAllGeoSites, publishNewChangesInMaster

Thanks

Jörg_Hoh

Thanks for the update. So if I understand you correctly, you activated a page in the master at /content/proj/en-us area, and as a consequence the matching page in the livecopy at /content/proj/ja-jp has been deleted. Is my understanding correct?

 

I am not aware of such behavior in any of the OOTB actions, and I see that you have custom actions defined. Can you check whether these custom actions could (directly or indirectly) cause a deletion?

Another thing you could check: you mentioned an exception you found in the logs. Can you check whether rollout actions (or, more generally, MSM-related class or package names) appear in the stack trace? If they do, that confirms the exception really belongs to this problem, and it can also point you to the code path that is causing the write action.
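As a rough illustration of that check, the following sketch scans stack trace text for frames whose class sits under the `com.day.cq.wcm.msm.` package, the prefix visible in the trace quoted earlier in this thread. The class and method names are hypothetical, and the sample trace in `main` is a shortened copy of the one posted above.

```java
import java.util.ArrayList;
import java.util.List;

class MsmTraceScan {

    // Package prefix of the MSM classes as it appears in the trace quoted in this thread
    static final String MSM_PREFIX = "com.day.cq.wcm.msm.";

    /** Returns the stack frames whose class lives under the MSM package. */
    public static List<String> findMsmFrames(String stackTrace) {
        List<String> frames = new ArrayList<>();
        // Frames are separated by "at", whether the trace is multi-line or pasted as one line
        for (String part : stackTrace.split("\\s+at\\s+")) {
            String frame = part.trim();
            if (frame.startsWith(MSM_PREFIX)) {
                frames.add(frame);
            }
        }
        return frames;
    }

    public static void main(String[] args) {
        String trace = "com.day.cq.wcm.api.WCMException: javax.jcr.InvalidItemStateException: "
            + "OakState0001: Unresolved conflicts in /content/proj/en-in/products/accessories/jcr:content"
            + " at com.day.cq.wcm.msm.impl.RolloutManagerImpl.save(RolloutManagerImpl.java:944)"
            + " at com.day.cq.wcm.msm.impl.RolloutManagerImpl.rolloutPageRelations(RolloutManagerImpl.java:627)";
        for (String frame : findMsmFrames(trace)) {
            System.out.println(frame);
        }
    }
}
```

The exception message itself is skipped because it starts with `com.day.cq.wcm.api.`, so only real MSM frames are reported. To look for the custom live actions as well, their package prefix would have to be added, which is not known from this thread.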

On the other hand, it might be a valid write action to the repository that conflicts with a delete operation, and we don't yet have any clue where that delete is coming from.

 

I think that you have a realistic chance to find out what caused this situation, but this requires some work on your side. If we can help you with some details, just keep on posting in this thread.

 

Jörg

 

Answers (1)

Shashi_Mulugu
MVP

13-07-2021

@akashdeepAEM So have you configured rollout on publish/activation? Have you tried configuring the "CQ MSM Content Delete Action" to exclude node types?