We recently launched to production and are seeing a problem we have not seen before. Our replication agents (author-to-author, author-to-publisher) get stuck for no apparent reason, holding items in the queue. There are no errors in the replication logs or in the error logs, including on both publishers (for example, when the queue gets stuck during publisher-to-publisher replication). As soon as we refresh the 'Adobe Granite Replication' bundle (adobe.cq.replication), the queue drains immediately. We do have the patch cq-5.6.1-hotfix-4101-1.0.zip installed (OutOfMemoryError on publish during replication of much data). Selecting the topmost item in the queue and doing a Force Retry does nothing, and it writes nothing to the replication agent logs or the error log.
Why is it that a bundle refresh gets the queue moving again?
Yes, thanks. Our replication agents are enabled and configured correctly, as described in that link. However, midway through a replication run the queue gets stuck. Say there are 5000 items in the queue: for no apparent reason, when it is down to around 2100, the queue stops moving and we see nothing in the logs: no errors, and no information when we force retry an entry in the queue. At that point, refreshing the bundle gets the queue moving again. Nothing else does.
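In case it helps anyone pin down when the stall happens, here is a minimal sketch of how it could be flagged from periodic samples of the queue depth. Everything in it is an illustrative assumption: `find_stall`, the three-sample window and the sample numbers are hypothetical and not part of AEM, and how the depth is actually sampled is deliberately left out.

```python
from typing import Optional, Sequence


def find_stall(depths: Sequence[int], window: int = 3) -> Optional[int]:
    """Return the index of the first sample at which the queue counts as stalled.

    A stall here means the depth is non-zero and has been identical for
    `window` consecutive samples. Returns None if no stall is found.
    """
    if window < 2:
        raise ValueError("window must be at least 2")
    run = 1  # length of the current run of identical depths
    for i in range(1, len(depths)):
        run = run + 1 if depths[i] == depths[i - 1] else 1
        # An empty queue that stays empty is idle, not stalled.
        if run >= window and depths[i] > 0:
            return i
    return None


# Hypothetical samples following the pattern described above:
# the queue drains from 5000 and then sits at 2100.
samples = [5000, 4200, 3100, 2100, 2100, 2100, 2100]
stalled_at = find_stall(samples)
if stalled_at is not None:
    print(f"queue looks stalled at sample {stalled_at} (depth {samples[stalled_at]})")
```

Recording a timestamp alongside each sample would make it possible to line the stall up with whatever else was happening on the instance at that moment.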