SOLVED

UserSync socialpubsync distribution blocked by rep:cache

Level 4

I've finally got UserSync working on the 3 publishers I have running on AEM 6.3.3.2, and things are mostly OK. However, the distribution queue for two of the publishers regularly gets blocked by a node ending in rep:cache, even when user nodes aren't being updated. We're currently testing UserSync features, so it's not available to our users yet, but given how often the distribution queue gets blocked and needs manual clearing, I don't think we can push it out.

I've followed the steps in the User Synchronization documentation, and under Vault Package Builder Factory it says to add /home/users|-.*/rep:cache to the package filters. I'm not sure what that filter is for, but given the mention further down of /home/users|+.*/rep:policy being used to overwrite the existing nodes, I figured the minus on rep:cache might prevent those nodes from being passed. I'm not seeing any rep:cache nodes, so they're not replicating.

Can anyone help me figure out why the distribution queue is getting blocked and how I can prevent this so that syncing will actually work on its own?

Thanks for anything.

1 Accepted Solution

Correct answer by
Level 10

Check whether rep:cache is enabled in your case. See Jackrabbit Oak – Caching Results of Principal Resolution:

Since Oak 1.3.4 this UserPrincipalProvider optionally allows for temporary caching of the principal resolution mainly to optimize login performance (OAK-3003). An administrator may enable the group principal caching via the org.apache.jackrabbit.oak.security.user.UserConfigurationImpl OSGi configuration. By default caching is disabled.

Check Apache Jackrabbit Oak UserConfiguration > Principal Cache Expiration in /system/console/configMgr.

This filter in the package definition excludes the rep:cache node from being synced even if caching is enabled:

  • /home/users|-.*/rep:cache

Could you validate the configuration of Apache Sling Distribution Packaging - Vault Package Builder Factory to ensure that the package filter is configured as described in the docs?

chlimage_1
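
To make the effect of that filter concrete, here is a minimal Python sketch that emulates how a rule such as /home/users|-.*/rep:cache might be evaluated. It is not the Sling Distribution or FileVault code. It assumes the rule is split on "|" into a root path plus include (+) and exclude (-) patterns, that each pattern is a regular expression matched against the full node path, and that the last matching pattern wins. The function names and the sample user paths are hypothetical.

```python
import re

def parse_filter(rule):
    """Split 'root|+include|-exclude' into (root, [(is_include, regex), ...])."""
    root, *patterns = rule.split("|")
    parsed = [(p.startswith("+"), re.compile(p[1:])) for p in patterns]
    return root, parsed

def is_packaged(path, rule):
    """Return True if `path` would be part of the package under `rule`."""
    root, patterns = parse_filter(rule)
    # Paths outside the filter root are never packaged.
    if path != root and not path.startswith(root + "/"):
        return False
    # Assumption: if the first pattern is an exclude (or there are no
    # patterns), everything under the root is included by default.
    included = not patterns or not patterns[0][0]
    # Assumption: the last pattern that matches the full path decides.
    for is_include, regex in patterns:
        if regex.fullmatch(path):
            included = is_include
    return included

rule = "/home/users|-.*/rep:cache"
for path in ("/home/users/a/abc123/profile",
             "/home/users/a/abc123/rep:cache"):
    print(path, is_packaged(path, rule))
```

Under these assumptions the profile node stays in the package while the rep:cache node is filtered out, which is the behavior the docs describe for this rule.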

7 Replies

Level 10

Hi - do you see any log messages that are related to this?

Level 4

This is the error message that I've found for socialpubsync in the error log.

10.02.2019 04:06:37.635 *ERROR* [sling-threadpool-ff2d4e3a-30e4-4993-a713-996bebde4b64-(apache-sling-job-thread-pool)-3160-org_apache_sling_distribution_queue_socialpubsync_endpoint2(org/apache/sling/distribution/queue/socialpubsync/endpoint2)] org.apache.sling.distribution.agent.impl.SimpleDistributionAgent [agent][socialpubsync] [endpoint2] PACKAGE-FAIL DSTRQ6628: could not deliver package dstrpck-2019--2--9--20--32--7f34dbd3-3b25-4c94-bf66-793b24ef98a6_388 org.apache.http.client.HttpResponseException: Server Error
org.apache.sling.distribution.common.DistributionException: org.apache.http.client.HttpResponseException: Server Error
    at org.apache.sling.distribution.transport.impl.SimpleHttpDistributionTransport.deliverPackage(SimpleHttpDistributionTransport.java:161)
    at org.apache.sling.distribution.packaging.impl.importer.RemoteDistributionPackageImporter.importPackage(RemoteDistributionPackageImporter.java:67)
    at org.apache.sling.distribution.agent.impl.SimpleDistributionAgentQueueProcessor.processQueueItem(SimpleDistributionAgentQueueProcessor.java:135)
    at org.apache.sling.distribution.agent.impl.SimpleDistributionAgentQueueProcessor.process(SimpleDistributionAgentQueueProcessor.java:92)
    at org.apache.sling.distribution.queue.impl.jobhandling.DistributionAgentJobConsumer.process(DistributionAgentJobConsumer.java:49)
    at org.apache.sling.event.impl.jobs.JobConsumerManager$JobConsumerWrapper.process(JobConsumerManager.java:502)
    at org.apache.sling.event.impl.jobs.queues.JobQueueImpl.startJob(JobQueueImpl.java:293)
    at org.apache.sling.event.impl.jobs.queues.JobQueueImpl.access$100(JobQueueImpl.java:60)
    at org.apache.sling.event.impl.jobs.queues.JobQueueImpl$1.run(JobQueueImpl.java:229)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.http.client.HttpResponseException: Server Error
    at org.apache.http.impl.client.AbstractResponseHandler.handleResponse(AbstractResponseHandler.java:70)
    at org.apache.http.client.fluent.Response.handleResponse(Response.java:90)
    at org.apache.http.client.fluent.Response.returnContent(Response.java:97)
    at org.apache.sling.distribution.transport.impl.SimpleHttpDistributionTransport.deliverPackage(SimpleHttpDistributionTransport.java:147)
    ... 11 common frames omitted

Here are the screenshots from the distribution queue. I had to wait for it to happen again to get them.

distribution_queue1.PNG

distribution_queue2.PNG
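
For anyone tracking how often a queue blocks, a small Python sketch like the one below could pull the agent, queue, and package ID out of PACKAGE-FAIL lines in error.log. The regular expression is derived only from the single log line quoted in this thread, so the format is an assumption and may differ in other AEM or Sling Distribution versions. The names PACKAGE_FAIL, find_failures, and SAMPLE are hypothetical, and the thread name in SAMPLE is abbreviated to "[...]".

```python
import re

# Pattern inferred from the one log line quoted in this thread.
PACKAGE_FAIL = re.compile(
    r"\[agent\]\[(?P<agent>[^\]]+)\]\s*\[(?P<queue>[^\]]+)\]\s*"
    r"PACKAGE-FAIL\s+(?P<request>\S+):\s*"
    r"could not deliver package\s+(?P<package>\S+)"
)

def find_failures(log_lines):
    """Return one dict (agent, queue, request, package) per failed delivery."""
    failures = []
    for line in log_lines:
        match = PACKAGE_FAIL.search(line)
        if match:
            failures.append(match.groupdict())
    return failures

SAMPLE = (
    "10.02.2019 04:06:37.635 *ERROR* [...] "
    "org.apache.sling.distribution.agent.impl.SimpleDistributionAgent "
    "[agent][socialpubsync] [endpoint2] PACKAGE-FAIL DSTRQ6628: "
    "could not deliver package "
    "dstrpck-2019--2--9--20--32--7f34dbd3-3b25-4c94-bf66-793b24ef98a6_388 "
    "org.apache.http.client.HttpResponseException: Server Error"
)

for failure in find_failures([SAMPLE]):
    print(failure["agent"], failure["queue"], failure["package"])
```

Since the exception here is only a generic "Server Error" from the HTTP transport, the underlying cause is probably logged on the receiving publisher, so the queue name is mainly useful for deciding which publisher's error.log to read.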

Level 4

Principal Cache Expiration is 0 on author, but it is set to 30000 on all 3 of the publish instances. Is that creating the problem? Should that be 0 instead?

For the Vault Package Builder Factory, the screenshot on the User Sync setup page doesn't match the options that I have, but I'm pretty sure it's correct.

1689283_pastedImage_0.png

Thanks!

Level 10

If Principal Cache Expiration is set to 30000, it will create rep:cache nodes on the publish servers. Check within your team why it was enabled and whether it should remain enabled on the publish servers. The main reason to enable it is to temporarily cache principal resolution, mostly to optimize login performance (assuming you have such a use case).

What version of AEM do you use?

Try removing this "rep:cache" configuration and test whether it works. The article below says to add these Vault configs only if you get a specific error: "OakConstraint0034: Attempt to create or change the system maintained cache."

Source - AEM Communities User Sync stops working

Level 4

I'm not sure why our publishers have rep:cache enabled, as that's not something we did intentionally. I've set Principal Cache Expiration to 0 on all 3 and will monitor through the day to see if this solves the issue.

Level 4

After changing Principal Cache Expiration to 0 on the publishers, everything is running smoothly. Thanks for your help!