
Getting Failed to create checkpoint message with AsyncIndexUpdate

October 14, 2020

Hi,

We are getting a 504 Gateway Timeout server error when trying to access the AEM author instance in our stage author environment. When this happens, the error log shows the following "Failed to create checkpoint" warning messages:

14.10.2020 04:01:01.940 *WARN* [sling-oak-54-org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate-async] org.apache.jackrabbit.oak.segment.scheduler.LockBasedScheduler Failed to create checkpoint 8a02222a-3c8d-44dc-8066-05e78ea31fad in 10 seconds.
14.10.2020 04:01:01.940 *WARN* [sling-oak-53-org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate-fulltext-async] org.apache.jackrabbit.oak.segment.scheduler.LockBasedScheduler Failed to create checkpoint a330f028-13fa-4c1f-9430-fb98fdc84cbd in 10 seconds.
14.10.2020 04:01:16.929 *WARN* [sling-oak-57-org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate-async] org.apache.jackrabbit.oak.segment.scheduler.LockBasedScheduler Failed to create checkpoint bc3de514-3188-4383-b138-c205a81bac68 in 10 seconds.
14.10.2020 04:01:16.938 *WARN* [sling-oak-52-org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate-fulltext-async] org.apache.jackrabbit.oak.segment.scheduler.LockBasedScheduler Failed to create checkpoint 13cc8004-2cc6-4c7e-8abb-4d579670c510 in 10 seconds.
14.10.2020 04:01:31.932 *WARN* [sling-oak-54-org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate-fulltext-async] org.apache.jackrabbit.oak.segment.scheduler.LockBasedScheduler Failed to create checkpoint ffd8d722-61ef-49a9-bde5-54d43ebc7663 in 10 seconds.
14.10.2020 04:01:31.941 *WARN* [sling-oak-58-org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate-async] org.apache.jackrabbit.oak.segment.scheduler.LockBasedScheduler Failed to create checkpoint 2d8be0dc-8fca-4a04-97d2-d0d4646cd87d in 10 seconds.
14.10.2020 04:01:46.924 *WARN* [sling-oak-53-org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate-async] org.apache.jackrabbit.oak.segment.scheduler.LockBasedScheduler Failed to create checkpoint 54e30c6c-3ee3-401e-a8a3-75949436449b in 10 seconds.
14.10.2020 04:01:46.929 *WARN* [sling-oak-57-org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate-fulltext-async] org.apache.jackrabbit.oak.segment.scheduler.LockBasedScheduler Failed to create checkpoint 1903c0fe-9f4a-47e9-b1d6-5328f9d85750 in 10 seconds.
14.10.2020 04:02:01.940 *WARN* [sling-oak-52-org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate-async] org.apache.jackrabbit.oak.segment.scheduler.LockBasedScheduler Failed to create checkpoint ed77ef61-3bac-4d59-b98a-4a98108abb3a in 10 seconds.

The server starts responding again after a while, and these warning messages stop being logged once it does. The last time this happened, it lasted nearly 100 minutes.
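To see how long such an episode lasted and which indexing lane (async or fulltext-async) was affected, the warnings can be summarized with a short script. This is only a minimal sketch in Python, assuming the log lines have exactly the format quoted above; the names `summarize` and `SAMPLE` are illustrative and not part of AEM or Oak.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches the WARN lines quoted in the post. The "lane" is whatever follows
# "AsyncIndexUpdate-" in the thread name (e.g. "async", "fulltext-async").
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2}\.\d{3}) \*WARN\* "
    r"\[[^\]]*AsyncIndexUpdate-(?P<lane>[^\]]+)\] "
    r".*Failed to create checkpoint (?P<id>[0-9a-f-]+) in (?P<secs>\d+) seconds\."
)


def summarize(lines):
    """Return {lane: (count, first_timestamp, last_timestamp)}."""
    seen = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            ts = datetime.strptime(match.group("ts"), "%d.%m.%Y %H:%M:%S.%f")
            seen[match.group("lane")].append(ts)
    return {lane: (len(ts), min(ts), max(ts)) for lane, ts in seen.items()}


# Three of the lines from the post, used here as sample input.
SAMPLE = [
    "14.10.2020 04:01:01.940 *WARN* [sling-oak-54-org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate-async] org.apache.jackrabbit.oak.segment.scheduler.LockBasedScheduler Failed to create checkpoint 8a02222a-3c8d-44dc-8066-05e78ea31fad in 10 seconds.",
    "14.10.2020 04:01:01.940 *WARN* [sling-oak-53-org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate-fulltext-async] org.apache.jackrabbit.oak.segment.scheduler.LockBasedScheduler Failed to create checkpoint a330f028-13fa-4c1f-9430-fb98fdc84cbd in 10 seconds.",
    "14.10.2020 04:01:16.929 *WARN* [sling-oak-57-org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate-async] org.apache.jackrabbit.oak.segment.scheduler.LockBasedScheduler Failed to create checkpoint bc3de514-3188-4383-b138-c205a81bac68 in 10 seconds.",
]

if __name__ == "__main__":
    for lane, (count, first, last) in sorted(summarize(SAMPLE).items()):
        print(f"{lane}: {count} failures between {first} and {last}")
```

Feeding the whole error.log through `summarize` (for example `summarize(open("error.log"))`) would give the first and last failure per lane, which is one way to confirm the length of an outage window.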

Any guidance here will be really helpful to understand the issue.

Thanks,

Bhawesh


harish_malineni

Level 2

Hi Bhawesh,
We faced the same issue earlier. Adobe recommended that we add the JVM argument -Doak.segmentNodeStore.commitFairLock=true and restart the cq5 service. It reduced the error rate but did not solve the issue completely.


Shashi_Kant

Adobe Employee

The warning just shows that some other thread was holding the lock at that time, so a checkpoint couldn't be created. Checkpoints should have been created at other times. You can run routine checks (oak-run check) to confirm that checkpoints are indeed being created and that there are good revisions available to which the repository can be reverted in times of crisis. Please add the JVM parameter -Doak.segmentNodeStore.commitFairLock=true and restart AEM; this should help resolve the issue.
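After the restart it can be worth confirming that the parameter actually reached the JVM. The following is a rough Python sketch that inspects a JVM command line string (for instance one copied from the process list). It assumes that the value is treated as a case-insensitive "true" and that the last occurrence of a repeated -D option wins; the function name `fair_lock_enabled` and the example command line (including the jar name) are hypothetical.

```python
import shlex

# The JVM parameter recommended in this thread.
FLAG = "-Doak.segmentNodeStore.commitFairLock"


def fair_lock_enabled(jvm_cmdline: str) -> bool:
    """Return True if the command line sets commitFairLock to true.

    Assumption: if the option appears more than once, the last one wins.
    """
    value = None
    for token in shlex.split(jvm_cmdline):
        if token.startswith(FLAG + "="):
            value = token.split("=", 1)[1]
    return value is not None and value.lower() == "true"


if __name__ == "__main__":
    # Hypothetical command line, for illustration only.
    cmd = "java -Xmx4g -Doak.segmentNodeStore.commitFairLock=true -jar aem-author.jar"
    print(fair_lock_enabled(cmd))
```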


bhawesh-dandona (Author)

Level 2

What we noticed is that it happens when the repository gets locked for write operations; users can't even log in to the system. AMS is pointing to issues related to indexing and says there are a few defects in the Oak version we are using with 6.4.2.0. They asked us to upgrade to at least 6.4.8.2. We are upgrading to 6.5 now anyway. Let's hope that resolves the issue.


khannapiyush36

Level 2

I know it's an old issue, but were you able to resolve it after the upgrade? We are currently on 6.5.14 and still facing this, and our AEM instance is slowing down.
