Jdruwe
21-01-2019
I've updated the trigger for the invalidation agent, but it still seems to trigger a replication for some reason. If it were our own code, wouldn't I be able to see it while debugging?
When I set the logging level for the agent to INFO, I can see the following output (https://pastebin.com/raw/kc6tVY4r). On line 14 you can see 'com.day.cq.replication.impl.ReplicatorImpl' setting up a replication with no options set, for some reason:
*INFO* [Thread-1352] com.day.cq.replication.impl.ReplicatorImpl Setting up replication with options: ReplicationOptions{synchronous=false, revision='null', suppressStatusUpdate=false, suppressVersions=false, filter=null, aggregateHandler=null}
I see no custom code intervening whatsoever.
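To scan a longer log for replications that were set up without a filter, a small parser along these lines might help. It is only a sketch in plain Java: the class `ReplicationLogLine` and its methods are hypothetical names, and the regular expressions assume the exact log format of the line quoted above (a filter whose text contains commas would confuse the simple split).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of AEM: picks apart a ReplicatorImpl log line.
class ReplicationLogLine {
    // Captures everything between "ReplicationOptions{" and the last "}".
    private static final Pattern OPTIONS = Pattern.compile("ReplicationOptions\\{(.*)\\}");
    // Captures the first bracketed token, which is the thread name in this format.
    private static final Pattern THREAD = Pattern.compile("\\[([^\\]]+)\\]");

    /** Returns the key=value pairs of the options, in the order they were logged. */
    static Map<String, String> parseOptions(String line) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = OPTIONS.matcher(line);
        if (!m.find()) {
            return result; // not a "Setting up replication" line
        }
        for (String pair : m.group(1).split(",\\s*")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                result.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return result;
    }

    /** Returns the thread name from the log line, or null if there is none. */
    static String threadName(String line) {
        Matcher m = THREAD.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    /** True when the line logs options whose filter is the literal "null". */
    static boolean hasNoFilter(String line) {
        return "null".equals(parseOptions(line).get("filter"));
    }
}
```

Feeding each log line through `hasNoFilter` and printing `threadName` for the matches would list the threads that start unfiltered replications.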
UPDATE: I tested something on 2 different livecopies. On LiveCopy A and B I had 3 rollout configurations set:
- /etc/msm/rolloutconfigs/pushonmodify
- /etc/msm/rolloutconfigs/activate
- /etc/msm/rolloutconfigs/deactivate
When I removed '/etc/msm/rolloutconfigs/activate' from LiveCopy A, only LiveCopy B got automatically published when modifying the page. Of course, we need '/etc/msm/rolloutconfigs/activate' to be there for every LiveCopy because we don't want our authors to have to publish each LiveCopy page individually. This really seems to be an AEM-related bug.
Gaurav-Behl
MVP
21-01-2019
I'm able to reproduce the behavior you mentioned only if I turn on the "On modification" checkbox of either the publish agent or the flush/invalidation agent. This makes sense, because 'pushonmodify' would trigger an 'onModify' event, which these agents would capture and then trigger the replication and flush accordingly.
If 'On modification' is left unchecked on both the publish agent and the invalidation agent, I do not see any publish activity with either the 'pushonmodify' or the 'activate' rollout unless the author publishes the content manually. Even removing 'pushonmodify' and doing a manual rollout doesn't trigger publish activity.
A couple of questions/tasks:
Reference: MSM Best Practices.
When using the rollout trigger onModify you should consider that:
Therefore, it is recommended that you only use onModify triggers if the benefits of automatic rollout initiation outweigh any potential performance issues.
Jörg_Hoh
Employee
21-01-2019
Strange. I see that the thread "ReplicateOnModification Processor" performs a replication with options and a filter set. That looks good. On the other hand, in line 14 a replication is set up without any filter (as you mentioned), but this is done by "Thread-1352". In AEM and Sling I would expect the product to use thread pools to manage such threads, so an unmanaged thread looks a bit suspicious to me. Can you try to repeat your scenario and take thread dumps to get a full stack trace? That can give us a further indication of the root cause.
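For catching the stack of one specific thread, a dump can also be taken from inside the JVM with the standard `Thread.getAllStackTraces()` call. The following is a minimal sketch in plain Java, assuming it runs in the same JVM as the code under investigation; the class `ThreadDumpSketch` and the method `dumpThreads` are names made up for this illustration.

```java
import java.util.Map;

// Hypothetical helper: prints state and stack of every live thread whose name starts with a prefix.
class ThreadDumpSketch {
    static String dumpThreads(String namePrefix) {
        StringBuilder out = new StringBuilder();
        // getAllStackTraces() returns a snapshot of all live threads and their stacks.
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            if (!thread.getName().startsWith(namePrefix)) {
                continue;
            }
            out.append('"').append(thread.getName()).append('"')
               .append(" state=").append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }
}
```

A single dump only shows what a thread is doing at that instant, so calling something like `dumpThreads("Thread-")` several times while the modification is being processed gives a better chance of seeing the thread at work rather than idle.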
Jdruwe
22-01-2019
How have you configured the flush agent to be triggered automatically on modification as you have mentioned in the description? Is it via "On modification" checkbox under "Triggers" tab or something else?
The dispatcher flush agent is the only agent that has the trigger 'On Modification' checked on its triggers tab.
Do you have "On modification" checkbox turned on under "Triggers" tab for default publish agent?
The publish agent has no triggers checked on its triggers tab.
Could you disable both default publish agent and dispatcher flush agent and check the blocked queues for both agents http://localhost:4502/etc/replication/agents.author/flush.html and http://localhost:4502/etc/replication/agents.author/publish.html
Publish agent
Just to be clear - If you remove/disable the flush/invalidation agent then you do not observe this kind of auto-replication behavior? You could validate same with above mentioned step of checking publish agent queue/logs.
It seems that when the dispatcher flush agent is disabled, I indeed don't see the auto-replication behaviour on modification. Only when I actually publish the page is the corresponding item added to the publish agent queue.
Could you remove 'pushonmodify' from live copies and do a manual rollout from source page and check the behavior if automated publish still happens?
When I replace my LiveCopy rollout configuration:
/etc/msm/rolloutconfigs/activate
/etc/msm/rolloutconfigs/deactivate
/etc/msm/rolloutconfigs/pushonmodify
with:
/etc/msm/rolloutconfigs/activate
/etc/msm/rolloutconfigs/deactivate
/etc/msm/rolloutconfigs/default
and trigger a manual rollout using http://localhost:4502/etc/blueprints.html, the auto-replication behaviour no longer happens. We benefit a lot from the '/etc/msm/rolloutconfigs/pushonmodify' config, and we are aware of the potential negative performance impact.
Jdruwe
22-01-2019
When I navigate to http://localhost:4502/system/console/status-jstack-threaddump, repeat the scenario, and look for 'Thread-1349', I find the following:
"Thread-1349" #3948 daemon prio=1 os_prio=31 tid=0x00007fa7c8aeb000 nid=0x16107 waiting on condition [0x00007000112a1000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000074d6967e8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
I am not sure if this is of any use.
Gaurav-Behl
MVP
22-01-2019
Editing a component -
The pushonmodify rollout pushed the content changes from the source to the live copies and then generated an 'onmodify' event, which was captured by the flush agent. Hence you see two invalidation requests with 'replication-service' triggered by the flush agent.
I'm not sure why 'webservice-support-replication' got triggered for the child pages; an explanation and fix for that are in [1]. Now 'msm-service' publish and invalidation request got triggered from the 'onmodify' trigger that was launched by 'pushonmodify' rollout. I believe this is what you mentioned as the problem statement. I'm not sure if this is a bug or expected functionality.
On publish-
Since the admin user triggered the publish, the first row is fine. In this case 'webservice-support-replication' [1] still got triggered. The rows with 'msm-service' are fine, as it was a publish + invalidation request.
[1] - Flush replications are triggered by webservice-support-replication user | AEM 6.x
To my knowledge, the common listener for the publish agent and the invalidation agent doesn't distinguish the action to be taken if the agent filter is null (which is the case here for the onmodify event). Jörg Hoh can probably help here.
The stack trace for 'Thread-1349' might be waiting on the agent itself because the queue was blocked.
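The hypothesis about a missing agent filter can be pictured with a toy model. This is not the com.day.cq.replication API: the classes `AgentFilterModel` and `Agent` and the method `targets` are hypothetical, and the rule "a null filter means no restriction" is an assumption that mirrors the hypothesis above, not confirmed behaviour of ReplicatorImpl.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Toy model only: hypothetical classes, not the AEM replication API.
class AgentFilterModel {
    static final class Agent {
        final String id;
        final boolean enabled;

        Agent(String id, boolean enabled) {
            this.id = id;
            this.enabled = enabled;
        }
    }

    /**
     * Returns the ids of the agents that would receive a replication request.
     * Assumption: a null filter places no restriction, so every enabled agent is selected.
     */
    static List<String> targets(List<Agent> agents, Predicate<Agent> filter) {
        List<String> ids = new ArrayList<>();
        for (Agent agent : agents) {
            if (agent.enabled && (filter == null || filter.test(agent))) {
                ids.add(agent.id);
            }
        }
        return ids;
    }
}
```

Under that assumption, a request with a filter that only accepts the flush agent reaches the flush agent alone, while the same request with `filter=null` reaches both the publish and the flush agent, which would match the unexpected publish seen in the queue.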
Jörg_Hoh
Employee
22-01-2019
That's the stack trace of an unused thread in a thread pool. To me it looks like this thread pool was not created using Sling mechanics; otherwise the thread name would be more meaningful (something like "pool-20-thread-7") or even a descriptive name ("oak-observation-1"). Are you sure that this thread is the one in question?
On the other hand, I see a "thread-25" here on my local AEM 6.4 instance which has the same stack trace. That means it can still be part of the product. Let me investigate that.
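The point about thread names and idle workers can be reproduced outside AEM with plain java.util.concurrent. The sketch below is not AEM or Sling code; `IdleWorkerSketch`, `Observation` and `observeIdleWorker` are hypothetical names. It assumes a fixed-size pool backed by a `LinkedBlockingQueue`, which is what `Executors.newFixedThreadPool` creates, and compares a thread factory that does not name its threads with the JDK default factory.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Plain-Java illustration, not AEM or Sling code. All names here are hypothetical.
class IdleWorkerSketch {

    /** Name and state of a pool worker, sampled after it has finished its only task. */
    static final class Observation {
        final String name;
        final Thread.State state;

        Observation(String name, Thread.State state) {
            this.name = name;
            this.state = state;
        }
    }

    static Observation observeIdleWorker(ThreadFactory factory) {
        ExecutorService pool = Executors.newFixedThreadPool(1, factory);
        try {
            AtomicReference<Thread> worker = new AtomicReference<>();
            // Run one trivial task so the pool creates its single worker thread.
            pool.submit(() -> worker.set(Thread.currentThread())).get();
            Thread thread = worker.get();
            // Give the worker up to 5 seconds to park while it waits for the next task.
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (thread.getState() != Thread.State.WAITING && System.nanoTime() < deadline) {
                Thread.yield();
            }
            return new Observation(thread.getName(), thread.getState());
        } catch (Exception e) {
            throw new IllegalStateException("could not observe the worker thread", e);
        } finally {
            pool.shutdownNow(); // stop the worker so the JVM can exit
        }
    }
}
```

With `observeIdleWorker(Thread::new)` the worker gets a JDK-generated name of the form `Thread-N`, and with `observeIdleWorker(Executors.defaultThreadFactory())` a name of the form `pool-N-thread-M`. In both cases the idle worker is in state WAITING, parked while taking from the work queue, like the dump above.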
Jdruwe
23-01-2019
"Now 'msm-service' publish and invalidation request got triggered from the 'onmodify' trigger that was launched by 'pushonmodify' rollout. I believe this is what you mentioned as the problem statement. I'm not sure if this is a bug or expected functionality." Indeed, I am not sure either :S
Jdruwe
23-01-2019
I investigated the empty ReplicationOptions thread using VisualVM:
[Thread-3788] com.day.cq.replication.impl.ReplicatorImpl Setting up replication with options:
ReplicationOptions{synchronous=false, revision='null', suppressStatusUpdate=false, suppressVersions=false, filter=null, aggregateHandler=null}
I see no custom code interfering, it seems like it's all part of the product.
Jdruwe
28-01-2019
Do you have any idea Jörg?