Hi Experience League Community,
We’ve been observing a recurring issue where our publish instances experience outages (a white page with a 504 error). During these events, telemetry data shows transactions related to startJob spiking and then plateauing 10-15 minutes before the downtime. These transactions remain consistently high until the publishers go down, at which point they drop to 0.
Upon inspecting the logs of our publishers, the only trace of startJob we could find is its presence in the stack trace of an unclosed ResourceResolver warning:
*WARN* [Apache Sling Resource Resolver Finalizer Thread] org.apache.sling.resourceresolver.impl.CommonResourceResolverFactoryImpl Closed unclosed ResourceResolver. The creation stacktrace is available on info log level.
*INFO* [Apache Sling Resource Resolver Finalizer Thread] org.apache.sling.resourceresolver.impl.CommonResourceResolverFactoryImpl Unclosed ResourceResolver was created here:
java.lang.Exception: Opening Stacktrace
    at org.apache.sling.resourceresolver.impl.CommonResourceResolverFactoryImpl$ResolverReference.<init>(...)
    at org.apache.sling.resourceresolver.impl.CommonResourceResolverFactoryImpl.register(...)
    at org.apache.sling.resourceresolver.impl.ResourceResolverImpl.<init>(...)
    ...
    at com.day.cq.dam.usage.impl.listener.AssetUsageListener.process(...)
    at org.apache.sling.event.impl.jobs.JobConsumerManager$JobConsumerWrapper.process(...)
    at org.apache.sling.event.impl.jobs.queues.JobQueueImpl.startJob(...)
    ...
We’re trying to understand:
Any insights or suggestions on how to further investigate this would be greatly appreciated.
Thanks in advance!
Hi @CharlesPa2,
In Adobe Experience Manager (AEM), starting a job transaction generally refers to leveraging the Sling Job Handling framework to execute asynchronous tasks.
These tasks, often referred to as Sling Jobs, are distinct from typical application transactions as they involve processing queued jobs using a dedicated system.
Can you check whether you have implemented any custom Sling jobs and, if so, whether any of them leave a resource resolver open? Please make sure the resolver is closed once the job is done.
You can use try-with-resources to do so.
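// The SUBSERVICE value is the subservice name that is mapped to a service user
try (ResourceResolver resourceResolver = resolverFactory.getServiceResourceResolver(Collections.singletonMap(ResourceResolverFactory.SUBSERVICE, "{subservice name}"))) {
    // Business logic using the resourceResolver
} catch (LoginException e) {
    // The service user mapping is missing or invalid; log and handle it here
}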
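To show why this matters for jobs that can fail midway, here is a minimal, self-contained sketch. It does not use the Sling API: TrackedResolver, leakyJob, safeJob and openAfter are hypothetical stand-ins I made up for illustration, with TrackedResolver playing the role of a ResourceResolver and simply counting how many instances are still open.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ResolverLeakDemo {

    // Hypothetical stand-in for a ResourceResolver (NOT a Sling class):
    // it only tracks how many instances have been opened but not closed.
    static final class TrackedResolver implements AutoCloseable {
        static final AtomicInteger OPEN = new AtomicInteger();

        TrackedResolver() {
            OPEN.incrementAndGet();
        }

        @Override
        public void close() {
            OPEN.decrementAndGet();
        }
    }

    // Leaky pattern: if the job body throws, close() is never reached.
    public static void leakyJob(boolean fail) {
        TrackedResolver resolver = new TrackedResolver();
        if (fail) {
            throw new IllegalStateException("job failed");
        }
        resolver.close();
    }

    // Safe pattern: try-with-resources calls close() even when the body throws.
    public static void safeJob(boolean fail) {
        try (TrackedResolver resolver = new TrackedResolver()) {
            if (fail) {
                throw new IllegalStateException("job failed");
            }
        }
    }

    // Resets the counter, runs the job, and reports how many resolvers are still open.
    public static int openAfter(Runnable job) {
        TrackedResolver.OPEN.set(0);
        try {
            job.run();
        } catch (IllegalStateException expected) {
            // The simulated job failure is expected in this demo.
        }
        return TrackedResolver.OPEN.get();
    }

    public static void main(String[] args) {
        System.out.println("leaky job, still open: " + openAfter(() -> leakyJob(true)));
        System.out.println("safe job, still open: " + openAfter(() -> safeJob(true)));
    }
}
```

With the simulated failure, the leaky variant leaves one resolver open while the try-with-resources variant leaves none. In a real job consumer the same idea applies to whatever resolver the job opens, though how this relates to your outage is something you would still need to confirm from your own logs.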
-Tarun
@CharlesPa2 just checking in! Were you able to get this resolved? If one of the replies above helped—whether it completely solved the issue or simply pointed you in the right direction—marking it as accepted can make it much easier for others with the same question to find a solution. And if you found a different way to fix it, sharing your approach would be a great contribution to the community. Your follow-up not only helps close the loop but also ensures others benefit from your experience. Thanks so much for being part of the conversation!