
SOLVED

Inconsistency issue


Former Community Member

We are running our application in a clustered environment and are seeing the exceptions below in our application error log. Do we need to run a consistency check on our repository to fix this issue? Any suggestion that helps us identify the action to take would be appreciated.

 

*ERROR* [pool-6-thread-4] org.apache.sling.discovery.impl.DiscoveryServiceImpl handleEvent: got a PersistenceException: org.apache.sling.api.resource.PersistenceException: Unable to commit changes to session. org.apache.sling.api.resource.PersistenceException: Unable to commit changes to session.
at org.apache.sling.jcr.resource.internal.helper.jcr.JcrResourceProvider.commit(JcrResourceProvider.java:513)
at org.apache.sling.resourceresolver.impl.helper.ResourceResolverContext.commit(ResourceResolverContext.java:148)
at org.apache.sling.resourceresolver.impl.ResourceResolverImpl.commit(ResourceResolverImpl.java:1090)
at org.apache.sling.discovery.impl.DiscoveryServiceImpl.doUpdateProperties(DiscoveryServiceImpl.java:376)
at org.apache.sling.discovery.impl.DiscoveryServiceImpl.updateProperties(DiscoveryServiceImpl.java:425)
at org.apache.sling.discovery.impl.common.heartbeat.HeartbeatHandler.issueHeartbeat(HeartbeatHandler.java:193)
at org.apache.sling.discovery.impl.common.heartbeat.HeartbeatHandler.run(HeartbeatHandler.java:150)
at org.apache.sling.commons.scheduler.impl.QuartzJobExecutor.execute(QuartzJobExecutor.java:56)
at org.quartz.core.JobRunShell.run(JobRunShell.java:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Caused by: javax.jcr.InvalidItemStateException: Unable to update a stale item: item.save()
at org.apache.jackrabbit.core.ItemSaveOperation.perform(ItemSaveOperation.java:262)
at org.apache.jackrabbit.core.session.SessionState.perform(SessionState.java:216)
at org.apache.jackrabbit.core.ItemImpl.perform(ItemImpl.java:91)
at org.apache.jackrabbit.core.ItemImpl.save(ItemImpl.java:329)
at org.apache.jackrabbit.core.session.SessionSaveOperation.perform(SessionSaveOperation.java:65)
at org.apache.jackrabbit.core.session.SessionState.perform(SessionState.java:216)
at org.apache.jackrabbit.core.SessionImpl.perform(SessionImpl.java:361)
at org.apache.jackrabbit.core.SessionImpl.save(SessionImpl.java:812)
at com.day.crx.core.CRXSessionImpl.save(CRXSessionImpl.java:142)
at org.apache.sling.jcr.resource.internal.helper.jcr.JcrResourceProvider.commit(JcrResourceProvider.java:511)
... 11 more
Caused by: org.apache.jackrabbit.core.state.StaleItemStateException: a49dac9d-bb77-485c-b85e-8168c02e1cb8/{}job.consumermanager.whitelist has been modified externally
at org.apache.jackrabbit.core.state.SharedItemStateManager$Update.begin(SharedItemStateManager.java:679)
at org.apache.jackrabbit.core.state.SharedItemStateManager.beginUpdate(SharedItemStateManager.java:1507)
at org.apache.jackrabbit.core.state.SharedItemStateManager.update(SharedItemStateManager.java:1537)
at org.apache.jackrabbit.core.state.LocalItemStateManager.update(LocalItemStateManager.java:400)
at org.apache.jackrabbit.core.state.XAItemStateManager.update(XAItemStateManager.java:354)
at org.apache.jackrabbit.core.state.LocalItemStateManager.update(LocalItemStateManager.java:375)
at org.apache.jackrabbit.core.state.SessionItemStateManager.update(SessionItemStateManager.java:275)
at org.apache.jackrabbit.core.ItemSaveOperation.perform(ItemSaveOperation.java:258)
... 20 more
 *ERROR* [pool-6-thread-4] org.apache.sling.commons.scheduler.impl.QuartzScheduler Exception during job execution of org.apache.sling.discovery.impl.common.heartbeat.HeartbeatHandler@3835ed1f : Exception while talking to repository (org.apache.sling.api.resource.PersistenceException: Unable to commit changes to session.) java.lang.RuntimeException: Exception while talking to repository (org.apache.sling.api.resource.PersistenceException: Unable to commit changes to session.)
at org.apache.sling.discovery.impl.DiscoveryServiceImpl.doUpdateProperties(DiscoveryServiceImpl.java:384)
at org.apache.sling.discovery.impl.DiscoveryServiceImpl.updateProperties(DiscoveryServiceImpl.java:425)
at org.apache.sling.discovery.impl.common.heartbeat.HeartbeatHandler.issueHeartbeat(HeartbeatHandler.java:193)
at org.apache.sling.discovery.impl.common.heartbeat.HeartbeatHandler.run(HeartbeatHandler.java:150)
at org.apache.sling.commons.scheduler.impl.QuartzJobExecutor.execute(QuartzJobExecutor.java:56)
at org.quartz.core.JobRunShell.run(JobRunShell.java:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Caused by: org.apache.sling.api.resource.PersistenceException: Unable to commit changes to session.
at org.apache.sling.jcr.resource.internal.helper.jcr.JcrResourceProvider.commit(JcrResourceProvider.java:513)
at org.apache.sling.resourceresolver.impl.helper.ResourceResolverContext.commit(ResourceResolverContext.java:148)
at org.apache.sling.resourceresolver.impl.ResourceResolverImpl.commit(ResourceResolverImpl.java:1090)
at org.apache.sling.discovery.impl.DiscoveryServiceImpl.doUpdateProperties(DiscoveryServiceImpl.java:376)
1 Accepted Solution


Correct answer by
Level 10

Two different frameworks within CQ use the term "clustering." There is CQ clustering, which makes the repositories of the clustered nodes virtually equivalent, and there is a Sling-based task-sharing cooperation [1][2] between CQ nodes, which is enabled by default between the nodes of a CQ cluster. The instances do not have to be clustered to use the task-sharing cooperation.

Within the Sling task-sharing group of CQ nodes, each CQ instance has a UUID. When more than one CQ instance has the same UUID as another, this sort of error can occur. Another cause is possible, but this is the first thing to check.

Go to the Sling settings console within Felix [3] on each of your CQ instances. That page shows the value of the "Sling ID." Confirm that the value is different on every instance.
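
If you have several instances to check, a small script can fetch the status page from each one and compare the reported IDs. The sketch below is only an illustration: the instance URLs, the credentials, and the regular expression used to pick out the ID are assumptions, and the status printer URL is taken from [3]. Adjust it if your version renders the page differently.

# Hedged sketch: compare the Sling ID reported by each instance's
# Felix status page [3]. URLs and credentials are placeholders.
import base64
import re
import urllib.request

INSTANCES = ["http://author1:4502", "http://author2:4502"]
USER, PASSWORD = "admin", "admin"

def sling_id(base_url):
    # Fetch the Sling settings status page with basic authentication.
    req = urllib.request.Request(base_url + "/system/console/status-slingsettings")
    token = base64.b64encode(f"{USER}:{PASSWORD}".encode()).decode()
    req.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(req) as resp:
        page = resp.read().decode("utf-8", errors="replace")
    # The exact page layout varies by version, so grep for the label
    # rather than assuming a fixed format.
    match = re.search(r"Sling\s*ID\s*[=:]?\s*([0-9a-fA-F-]{36})", page)
    return match.group(1) if match else None

ids = {url: sling_id(url) for url in INSTANCES}
for url, value in ids.items():
    print(url, "->", value or "not found, inspect the page manually")
if len(set(ids.values())) < len(ids):
    print("WARNING: at least two instances report the same Sling ID")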

If it is identical, you will need to change it. Go to the [crx-default]/launchpad/felix directory of each CQ instance you need to change and look for a file named sling.id.file. This file contains the UUID used for the topology; change it so that it is unique on each instance. Treat the UUID as a hexadecimal number: on the master CQ node, that number needs to be less than the UUIDs of the other instances.
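
As a hedged sketch of that change, assuming the file simply stores the UUID as plain text (verify this against your version, and stop the instance before touching the file), the following writes a fresh UUID into sling.id.file. The path is an illustrative placeholder; use the actual location of your instance.

# Hedged sketch: replace the UUID stored in sling.id.file.
# Assumption: the file holds the UUID as plain text; the path is an example.
import pathlib
import uuid

SLING_ID_FILE = pathlib.Path("/opt/cq/crx-quickstart/launchpad/felix/sling.id.file")

old_id = SLING_ID_FILE.read_text().strip()
new_id = str(uuid.uuid4())
SLING_ID_FILE.write_text(new_id)
print(f"Sling ID changed from {old_id} to {new_id}")

Because of the note above about the master node needing the numerically lowest UUID, compare the generated value against the IDs of the other instances before restarting, and regenerate if the ordering is not what you want.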

After you have changed the UUIDs, bounce all of the CQ instances.

A duplicate Sling ID is the most likely cause of your problem.

 

[1] http://dev.day.com/docs/en/cq/current/deploying/offloading.html

[2] https://sling.apache.org/documentation/bundles/discovery-api-and-impl.html

[3] http://[host]:[port]/system/console/status-slingsettings
