I've finally got UserSync working on the 3 publishers I have running on AEM 6.3.3.2. Things are mostly OK until the distribution queue for two of the publishers gets blocked by a node ending in rep:cache, which happens pretty regularly, even when user nodes aren't being updated. We're currently testing UserSync, so it isn't available to our users yet, but given how often the distribution queue gets blocked and needs manual clearing, I don't think we can roll it out.
I've followed the steps here -- User Synchronization -- and under Vault Package Builder Factory, it says to add /home/users|-.*/rep:cache to the package filters. I'm not sure what it's for, but given the mention further down of /home/users|+.*/rep:policy being used to overwrite existing nodes, I figured the minus in front of rep:cache might prevent those nodes from being passed along. I'm not seeing any rep:cache nodes, so they're not replicating.
Can anyone help me figure out why the distribution queue is getting blocked and how I can prevent it, so that syncing will actually work on its own?
Thanks for anything.
Hi - do you see any log messages that are related to this?
This is the error message that I've found for socialpubsync in the error log.
10.02.2019 04:06:37.635 *ERROR* [sling-threadpool-ff2d4e3a-30e4-4993-a713-996bebde4b64-(apache-sling-job-thread-pool)-3160-org_apache_sling_distribution_queue_socialpubsync_endpoint2(org/apache/sling/distribution/queue/socialpubsync/endpoint2)] org.apache.sling.distribution.agent.impl.SimpleDistributionAgent [agent][socialpubsync] [endpoint2] PACKAGE-FAIL DSTRQ6628: could not deliver package dstrpck-2019--2--9--20--32--7f34dbd3-3b25-4c94-bf66-793b24ef98a6_388 org.apache.http.client.HttpResponseException: Server Error
org.apache.sling.distribution.common.DistributionException: org.apache.http.client.HttpResponseException: Server Error
at org.apache.sling.distribution.transport.impl.SimpleHttpDistributionTransport.deliverPackage(SimpleHttpDistributionTransport.java:161)
at org.apache.sling.distribution.packaging.impl.importer.RemoteDistributionPackageImporter.importPackage(RemoteDistributionPackageImporter.java:67)
at org.apache.sling.distribution.agent.impl.SimpleDistributionAgentQueueProcessor.processQueueItem(SimpleDistributionAgentQueueProcessor.java:135)
at org.apache.sling.distribution.agent.impl.SimpleDistributionAgentQueueProcessor.process(SimpleDistributionAgentQueueProcessor.java:92)
at org.apache.sling.distribution.queue.impl.jobhandling.DistributionAgentJobConsumer.process(DistributionAgentJobConsumer.java:49)
at org.apache.sling.event.impl.jobs.JobConsumerManager$JobConsumerWrapper.process(JobConsumerManager.java:502)
at org.apache.sling.event.impl.jobs.queues.JobQueueImpl.startJob(JobQueueImpl.java:293)
at org.apache.sling.event.impl.jobs.queues.JobQueueImpl.access$100(JobQueueImpl.java:60)
at org.apache.sling.event.impl.jobs.queues.JobQueueImpl$1.run(JobQueueImpl.java:229)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.http.client.HttpResponseException: Server Error
at org.apache.http.impl.client.AbstractResponseHandler.handleResponse(AbstractResponseHandler.java:70)
at org.apache.http.client.fluent.Response.handleResponse(Response.java:90)
at org.apache.http.client.fluent.Response.returnContent(Response.java:97)
at org.apache.sling.distribution.transport.impl.SimpleHttpDistributionTransport.deliverPackage(SimpleHttpDistributionTransport.java:147)
... 11 common frames omitted
Here are the screenshots from the distribution queue. I had to wait for it to happen again to get them.
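For anyone who wants to spot these failures without waiting for the queue to block again, here is a minimal sketch that pulls the agent, queue, and package ID out of PACKAGE-FAIL lines in error.log. The regular expression is modeled only on the single log line quoted above, and the function names are hypothetical; other AEM or Sling Distribution versions may format the line differently.

```python
import re
from collections import Counter

# Pattern modeled on the one PACKAGE-FAIL line quoted in this thread.
# Assumption: other log lines of this type follow the same layout.
PACKAGE_FAIL = re.compile(
    r"\[agent\]\[(?P<agent>[^\]]+)\]\s*\[(?P<queue>[^\]]+)\]\s*PACKAGE-FAIL\s+"
    r"(?P<request>\S+): could not deliver package (?P<package>\S+)"
)


def find_package_failures(lines):
    """Return one dict (agent, queue, request, package) per PACKAGE-FAIL line."""
    failures = []
    for line in lines:
        match = PACKAGE_FAIL.search(line)
        if match:
            failures.append(match.groupdict())
    return failures


def failures_per_queue(failures):
    """Count failures per (agent, queue) pair to see which endpoint is stuck."""
    return Counter((f["agent"], f["queue"]) for f in failures)
```

Passing an open log file to find_package_failures (a file object iterates line by line) and then calling failures_per_queue on the result gives a quick count of which endpoint queue is failing most often.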
Is rep:cache enabled in your case? See Jackrabbit Oak – Caching Results of Principal Resolution:
Since Oak 1.3.4 this UserPrincipalProvider optionally allows for temporary caching of the principal resolution mainly to optimize login performance (OAK-3003). An administrator may enable the group principal caching via the org.apache.jackrabbit.oak.security.user.UserConfigurationImpl OSGi configuration. By default caching is disabled.
Check Apache Jackrabbit Oak UserConfiguration > Principal Cache Expiration in /system/console/configMgr
The /home/users|-.*/rep:cache filter in the package definition should exclude rep:cache nodes from being synced even if caching is enabled.
Could you validate the configuration of Apache Sling Distribution Packaging - Vault Package Builder Factory to ensure that the package filter is configured correctly, as described in the docs?
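To make the intent of the exclude rule concrete, here is a simplified sketch of how an entry such as /home/users|-.*/rep:cache could be evaluated against a node path. This is an illustrative model, not the actual Vault or Sling Distribution implementation: the function names are hypothetical, and the semantics (everything under the root is included by default, and the last matching +/- rule wins) are an assumption.

```python
import re


def parse_filter(entry):
    """Split 'root|+regex|-regex' into (root, [(is_include, compiled_regex), ...])."""
    root, *rules = entry.split("|")
    parsed = []
    for rule in rules:
        if not rule or rule[0] not in "+-":
            raise ValueError(f"rule must start with + or -: {rule!r}")
        parsed.append((rule[0] == "+", re.compile(rule[1:])))
    return root, parsed


def is_included(path, entry):
    """Decide whether a path would be packaged under the simplified model."""
    root, rules = parse_filter(entry)
    # Paths outside the filter root are never covered by this entry.
    if path != root and not path.startswith(root + "/"):
        return False
    # Assumption: included by default; the last rule matching the full path wins.
    included = True
    for is_include, pattern in rules:
        if pattern.fullmatch(path):
            included = is_include
    return included
```

Under this model, a user node below /home/users is packaged, while a path ending in /rep:cache is dropped, which is the effect the docs appear to be after.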
Principal Cache Expiration is 0 on author, but it is set to 30000 on all 3 of the publish instances. Is that creating the problem? Should that be 0 instead?
For the Vault Package Builder Factory, the screenshot on the User Sync setup page doesn't match the options that I have, but I'm pretty sure it's correct.
Thanks!
If Principal Cache Expiration is set to 30000, it will create rep:cache nodes on the publish servers. You can check within your team why it was enabled and whether it should remain enabled on the publish servers. The main reason to enable it is to allow temporary caching of principal resolution, mainly to optimize login performance (assuming you have such a use case).
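As a rough illustration of what an expiration setting like this means, here is a minimal sketch of a time-based cache in which a value of 0 disables caching entirely. It is not Oak's implementation: the class, the resolver callback, and the in-memory dictionary are all hypothetical stand-ins (Oak keeps its cache in the rep:cache node rather than in a dictionary), and the expiration is assumed to be in milliseconds.

```python
import time


class ExpiringPrincipalCache:
    """Toy model: cache resolved group principals per user for a limited time."""

    def __init__(self, expiration_ms, resolver, clock=time.monotonic):
        self.expiration_ms = expiration_ms  # 0 (or less) means caching is off
        self.resolver = resolver            # hypothetical callback: user id -> principals
        self.clock = clock                  # injectable clock, returns seconds
        self._entries = {}                  # user id -> (expiry time, principals)

    def group_principals(self, user_id):
        if self.expiration_ms <= 0:
            # Caching disabled: always resolve and never write a cache entry.
            return self.resolver(user_id)
        now = self.clock()
        cached = self._entries.get(user_id)
        if cached is not None and cached[0] > now:
            return cached[1]
        principals = self.resolver(user_id)
        # Writing this entry is the step that corresponds to creating rep:cache.
        self._entries[user_id] = (now + self.expiration_ms / 1000.0, principals)
        return principals
```

With an expiration of 0 the cache never stores anything, which matches the observation in this thread that only the instances with a non-zero value are the ones where rep:cache comes into play.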
What version of AEM do you use?
Try removing these "rep:cache" configurations and test whether it works. The article below says to add these Vault configs only if you get the specific error "OakConstraint0034: Attempt to create or change the system maintained cache."
Source - AEM Communities User Sync stops working
I'm not sure why our publishers have Principal Cache Expiration set, as that's not something we did intentionally. I've set it to 0 on all 3 and will monitor through the day to see whether this solves the issue.
After changing Principal Cache Expiration to 0 on the publishers, everything is running smoothly. Thanks for your help!