SOLVED

AEM 6.3 upgrade failed due to ClusterNodeInfo null pointer Exception

Level 2

I am trying to upgrade an AEM 6.1 cluster instance to AEM 6.3. The in-place upgrade failed with the error below.

15.11.2017 11:50:36.704 *ERROR* [DocumentNodeStore background update thread (8)] org.apache.jackrabbit.oak.plugins.document.ClusterNodeInfo This oak instance failed to update the lease in time and can therefore no longer access this DocumentNodeStore. (leaseEndTime: 1510746754503, leaseTime: 120000, leaseFailureMargin: 20000, lease check end time (leaseEndTime-leaseFailureMargin): 1510746734503, now: 1510746635595, remaining: 98908) Need to stop oak-core/DocumentNodeStoreService.

15.11.2017 11:54:21.978 *ERROR* [DocumentNodeStore background update thread (8)] org.apache.jackrabbit.oak.plugins.document.ClusterNodeInfo This oak instance failed to update the lease in time and can therefore no longer access this DocumentNodeStore.

15.11.2017 11:54:51.803 *WARN* [DocumentNodeStore lease update thread (8)] org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore Background operation failed: org.apache.jackrabbit.oak.plugins.document.DocumentStoreException: This oak instance failed to update the lease in time and can therefore no longer access this DocumentNodeStore.

org.apache.jackrabbit.oak.plugins.document.DocumentStoreException: This oak instance failed to update the lease in time and can therefore no longer access this DocumentNodeStore.

After this exception, a java.lang.OutOfMemoryError is also thrown.

1 Accepted Solution

Correct answer by
Employee Advisor

The OOM indicates that you have a memory problem.

When the heap is nearly exhausted, the JVM tries to free up heap very aggressively and therefore runs full garbage collections very often. A full garbage collection stops the entire JVM while it is running. If many other threads are running as well, the Oak lease update thread wants to run and access the repository, but it fails to do so in time.

Enable garbage collection logging and check your heap calculation. You either have a memory leak, or your memory sizing is incorrect, or you may simply have hit a period of high memory usage. You should investigate that.
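
As a rough complement to GC logging (on a Java 8 HotSpot JVM this is commonly switched on with flags such as -verbose:gc -XX:+PrintGCDetails -Xloggc:<file>), the sketch below reads heap usage and accumulated GC time through the standard java.lang.management API. The class name HeapSnapshot is made up for illustration, and the code only reports on the JVM it runs in, so it would have to be executed inside the AEM JVM (or adapted to a remote JMX connection) to say anything about AEM itself.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of AEM or Oak.
final class HeapSnapshot {

    /** Fraction of the maximum heap currently in use, or -1 if the JVM reports no maximum. */
    static double heapUsedRatio() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax();
        return max > 0 ? (double) heap.getUsed() / max : -1.0;
    }

    /** Approximate total time in milliseconds spent in all collectors since JVM start. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("heap used ratio: " + heapUsedRatio());
        System.out.println("total GC time (ms): " + totalGcTimeMillis());
    }
}
```

A heap ratio that stays close to 1.0 together with fast-growing GC time would be consistent with the full-GC pattern described above.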

Jörg

4 Replies

Administrator

Hi

There are many reasons why one particular instance would not update a lease in time:

1. Typically, cloud infrastructure (many firewall rules) or disk/memory constraints on a developer machine make the renewal take longer than the default configured 30 seconds.

2. The instance cannot talk to the backend MongoDB/RDB.

3. Memory is very low, which leads to very long GC cycles and prevents much from happening in the VM within the 30 seconds available to renew.

4. A bug in the BackgroundLeaseUpdate implementation.
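
To make the numbers in the error message from the question easier to read, here is a small sketch of how they relate to each other. It only mirrors the arithmetic the log line itself prints (lease check end time = leaseEndTime - leaseFailureMargin, remaining = lease check end time - now); the class LeaseMath is made up for illustration and is not Oak's ClusterNodeInfo code.

```java
// Hypothetical helper that reproduces the arithmetic shown in the log line.
final class LeaseMath {

    /** Point in time (epoch millis) after which the lease check fails. */
    static long leaseCheckEnd(long leaseEndTime, long leaseFailureMargin) {
        return leaseEndTime - leaseFailureMargin;
    }

    /** Milliseconds left until the lease check end time; negative once it has passed. */
    static long remaining(long leaseEndTime, long leaseFailureMargin, long now) {
        return leaseCheckEnd(leaseEndTime, leaseFailureMargin) - now;
    }

    public static void main(String[] args) {
        // Values copied from the first ERROR line in the question.
        long leaseEndTime = 1510746754503L;
        long leaseFailureMargin = 20000L;
        long now = 1510746635595L;

        System.out.println("lease check end time: "
                + leaseCheckEnd(leaseEndTime, leaseFailureMargin));
        System.out.println("remaining: "
                + remaining(leaseEndTime, leaseFailureMargin, now));
    }
}
```

With the values from the log this gives a lease check end time of 1510746734503 and 98908 ms remaining, matching what the message reports.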

Try the following:

1. Adjust the lease timeout (this did not work for me).

2. Disable the lease check by adding -Doak.documentMK.disableLeaseCheck=true to the startup script. In my opinion this is only advisable for lower environments, since all activities there are trials.

3. For production and pre-production, capture a TCP dump on both AEM and the backend and inspect it in Wireshark. Then derive an optimal timeout value from that analysis.

Jörg Hoh, can you please help further here?



Kautuk Sahni

Level 2

Thanks Jörg and kautuksahni for your answers. After adding -Doak.documentMK.disableLeaseCheck=true, the upgrade completed and the java.lang.OutOfMemoryError no longer appears. However, the workflow migration step still failed with the error below.

Is the error below also caused by insufficient heap memory?

  • *ERROR* [com.adobe.granite.workflow.upgrade.WorkflowPreserveContentHook] WorkflowPreserveContentHook.java:159  Error in workflow content preserve hook
  • java.lang.IllegalStateException: Branch with failed reset
  • Caused by: org.apache.jackrabbit.oak.plugins.document.DocumentStoreException: No space left on device.

Employee Advisor

Nope.

> Caused by: org.apache.jackrabbit.oak.plugins.document.DocumentStoreException: No space left on device.

This means that your disk/partition is full.
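
If you want to confirm this from Java rather than from the shell, a minimal sketch using the standard java.io.File API looks like this. The class name DiskSpaceCheck and the choice of directory are assumptions for illustration; point it at whichever partition holds your repository or database files.

```java
import java.io.File;

// Hypothetical helper, not part of AEM or Oak.
final class DiskSpaceCheck {

    /** Bytes available to this JVM on the partition containing the given path (0 if unknown). */
    static long usableBytes(File path) {
        return path.getUsableSpace();
    }

    /** True if at least requiredBytes are usable on the partition containing the path. */
    static boolean hasAtLeast(File path, long requiredBytes) {
        return usableBytes(path) >= requiredBytes;
    }

    public static void main(String[] args) {
        // Directory to check; defaults to the current working directory.
        File dir = new File(args.length > 0 ? args[0] : ".");
        System.out.println(dir.getAbsolutePath() + ": "
                + usableBytes(dir) / (1024 * 1024) + " MiB usable");
    }
}
```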

Jörg