Fluctuating Performance after upgrade from AEM 6.1 to 6.2

Level 3

We have an environment which was recently upgraded from AEM 6.1 to 6.2.  

Intermittently, for periods of 15 minutes to an hour, the server becomes nearly unresponsive.  During these periods there are no errors in the logs.

However, I've noticed warnings related to idle sessions appearing every 20-30 seconds, which seems excessive.  Is this related?  Any other thoughts?

28.06.2016 15:33:40.203 *INFO* [10.240.1.3 [1467141961252] GET /mnt/overlay/granite/ui/content/shell/header/actions/pulse.data.json HTTP/1.1] com.adobe.granite.repository Service [22484, [org.apache.jackrabbit.oak.api.jmx.SessionMBean]] ServiceEvent UNREGISTERING
28.06.2016 15:33:40.204 *WARN* [10.240.1.3 [1467141961252] GET /mnt/overlay/granite/ui/content/shell/header/actions/pulse.data.json HTTP/1.1] org.apache.jackrabbit.oak.jcr.session.RefreshStrategy This session has been idle for 7 minutes and might be out of date. Consider using a fresh session or explicitly refresh the session.
java.lang.Exception: The session was created here:
    at org.apache.jackrabbit.oak.jcr.session.RefreshStrategy$LogOnce.<init>(RefreshStrategy.java:169)
    at org.apache.jackrabbit.oak.jcr.repository.RepositoryImpl.login(RepositoryImpl.java:277)
    at com.adobe.granite.repository.impl.CRX3RepositoryImpl.login(CRX3RepositoryImpl.java:94)
    at org.apache.jackrabbit.oak.jcr.repository.RepositoryImpl.login(RepositoryImpl.java:219)
    at org.apache.sling.jcr.base.AbstractSlingRepository2.login(AbstractSlingRepository2.java:288)
    at org.apache.sling.jcr.resource.internal.helper.jcr.JcrProviderStateFactory.createProviderState(JcrProviderStateFactory.java:121)
    at org.apache.sling.jcr.resource.internal.helper.jcr.JcrResourceProvider.authenticate(JcrResourceProvider.java:267)
    at org.apache.sling.jcr.resource.internal.helper.jcr.JcrResourceProvider.authenticate(JcrResourceProvider.java:78)
    at org.apache.sling.resourceresolver.impl.providers.stateful.ProviderManager.authenticate(ProviderManager.java:161)
    at org.apache.sling.resourceresolver.impl.providers.stateful.ProviderManager.getOrCreateProvider(ProviderManager.java:87)
    at org.apache.sling.resourceresolver.impl.providers.stateful.ProviderManager.authenticateAll(ProviderManager.java:129)
    at org.apache.sling.resourceresolver.impl.ResourceResolverImpl.createControl(ResourceResolverImpl.java:154)
    at org.apache.sling.resourceresolver.impl.ResourceResolverImpl.<init>(ResourceResolverImpl.java:116)
    at org.apache.sling.resourceresolver.impl.ResourceResolverImpl.<init>(ResourceResolverImpl.java:110)
    at org.apache.sling.resourceresolver.impl.CommonResourceResolverFactoryImpl.getResourceResolverInternal(CommonResourceResolverFactoryImpl.java:257)
    at org.apache.sling.resourceresolver.impl.CommonResourceResolverFactoryImpl.getResourceResolver(CommonResourceResolverFactoryImpl.java:162)
    at org.apache.sling.resourceresolver.impl.ResourceResolverFactoryImpl.getResourceResolver(ResourceResolverFactoryImpl.java:99)
    at org.apache.sling.auth.core.impl.SlingAuthenticator.getResolver(SlingAuthenticator.java:782)
    at org.apache.sling.auth.core.impl.SlingAuthenticator.doHandleSecurity(SlingAuthenticator.java:497)
    at org.apache.sling.auth.core.impl.SlingAuthenticator.handleSecurity(SlingAuthenticator.java:451)
    at org.apache.sling.engine.impl.SlingHttpContext.handleSecurity(SlingHttpContext.java:121)
    at org.apache.felix.http.base.internal.service.ServletContextImpl.handleSecurity(ServletContextImpl.java:421)
    at org.apache.felix.http.base.internal.dispatch.InvocationChain.doFilter(InvocationChain.java:57)
    at org.apache.felix.http.base.internal.dispatch.Dispatcher.dispatch(Dispatcher.java:124)
    at org.apache.felix.http.base.internal.DispatcherServlet.service(DispatcherServlet.java:61)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:725)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:812)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:587)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
    at org.eclipse.jetty.server.Server.handle(Server.java:499)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:311)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
    at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
    at java.lang.Thread.run(Thread.java:745)

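
To check whether the idle-session warnings really cluster around the slow periods, one option is to count them per minute. Below is a minimal Python sketch, assuming each log entry starts on its own line with the timestamp format shown above. The function name idle_warnings_per_minute is made up for illustration and is not part of any AEM tooling.

```python
import re
from collections import Counter

# Matches the start of a WARN entry such as
# "28.06.2016 15:33:40.204 *WARN* ..." and captures "dd.MM.yyyy HH:mm".
ENTRY = re.compile(r"^(\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}):\d{2}\.\d{3} \*WARN\*")

# Logger name taken from the warning quoted in this post.
MARKER = "org.apache.jackrabbit.oak.jcr.session.RefreshStrategy"


def idle_warnings_per_minute(lines):
    """Count RefreshStrategy WARN entries, keyed by 'dd.MM.yyyy HH:mm'.

    Stack-trace lines are skipped because they do not start with a timestamp.
    Keys appear in the order they are first seen in the log.
    """
    counts = Counter()
    for line in lines:
        match = ENTRY.match(line)
        if match and MARKER in line:
            counts[match.group(1)] += 1
    return counts
```

Passing an open log file to idle_warnings_per_minute (a file object iterates line by line) and printing the resulting items gives one count per minute, which can then be compared against the times the server was unresponsive.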
8 Replies

Level 2

Hi,

 

Did you find any solution?

Seems I face the same trouble... :)

Employee

Hi,

AEM 6.2 automatically takes thread dumps and writes them to "crx-quickstart/threaddumps". You should analyse these, or take thread dumps yourself when your system starts slowing down.

Thread dumps should help you understand what is going on in your system, and from there you can investigate whether the issue is with OOTB or custom code.
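
As a rough starting point for reading a dump, the Python sketch below counts threads per state and tallies the top stack frame of each thread. It assumes the dump uses the standard jstack-style layout (a "java.lang.Thread.State:" line followed by "at ..." frames, with a blank line between threads). The function name summarize_dump is hypothetical, and I have not confirmed that the files AEM writes follow this exact layout.

```python
import re
from collections import Counter

# "   java.lang.Thread.State: BLOCKED (on object monitor)" -> "BLOCKED"
STATE = re.compile(r"^\s*java\.lang\.Thread\.State: (\w+)")
# "\tat com.example.Foo.bar(Foo.java:10)" -> "com.example.Foo.bar"
FRAME = re.compile(r"^\s*at ([\w.$<>]+)\(")


def summarize_dump(text):
    """Return (thread states, top-of-stack frames) counted over one dump."""
    states, top_frames = Counter(), Counter()
    expect_frame = False  # True between a state line and its first frame
    for line in text.splitlines():
        state = STATE.match(line)
        if state:
            states[state.group(1)] += 1
            expect_frame = True
            continue
        frame = FRAME.match(line)
        if frame and expect_frame:
            top_frames[frame.group(1)] += 1
            expect_frame = False
        elif not line.strip():
            # A blank line ends the current thread's stack.
            expect_frame = False
    return states, top_frames
```

Calling most_common() on either counter shows where threads pile up. Many BLOCKED threads sharing the same top frame across several consecutive dumps is usually a good place to start looking.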

Have you attached a profiler to the instance to see how resources are being consumed? You should do this monitoring at the OS level as well.

Regards,

Opkar

[0] https://docs.adobe.com/docs/en/aem/6-2/administer/operations/troubleshoot.html

Level 2

Hi,

 

Thanks for the help.

The issue was caused by around 500 workflows that had been started and crashed on the first step, so they kept restarting over and over.

Seems that the loggin module was failing too.

 

Fixed!

 

Regards

Grégory

Level 4

Sounds like backup tasks running.  Did you investigate the maintenance schedule configurations?  Which platform, *nix or Windows?  Have you monitored the server when this is happening?  Is it CPU saturation or network saturation?

Hope this helps...

Level 1
Can you please explain how this issue was fixed for you, and the steps you took? Our environment seems to be facing the same issue. Thanks in advance.

Level 2

We have the same issue too. How can it be resolved?

Level 4

I would suggest you look at the JMX event log and see what is running.  This post says it was solved by fixing the workflow that was crashing on the first step.  Those workflow activities are generally logged as events.  If you do some due diligence on your event logs, you might find the cause of your slowdowns.

system/console/jmx/org.apache.jackrabbit.oak%3Aid%3D1%2Cname%3D"Consolidated+Event+Listener+statistics"%2Ctype%3D"ConsolidatedListenerStats"

system/console/jmx/org.apache.sling%3Aname%3DEvent+processing+pool%2Cservice%3DThreadPool%2Ctype%3Dthreads
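
The two paths above are just URL-encoded MBean names appended to system/console/jmx/. If you want to look up other MBeans the same way, here is a small Python sketch that converts between the readable name and the console path. The helper names mbean_name and console_path are made up for illustration, and leaving the double quotes unencoded is an assumption copied from the first path above.

```python
from urllib.parse import quote_plus, unquote_plus

CONSOLE_PREFIX = "system/console/jmx/"


def mbean_name(path):
    """Decode a console JMX path into a readable MBean ObjectName."""
    if not path.startswith(CONSOLE_PREFIX):
        raise ValueError("not a JMX console path: " + path)
    return unquote_plus(path[len(CONSOLE_PREFIX):])


def console_path(object_name):
    """Encode an MBean ObjectName into a console path.

    Double quotes are left as-is to match the paths quoted in this post.
    """
    return CONSOLE_PREFIX + quote_plus(object_name, safe='"')
```

For example, mbean_name applied to the second path above gives org.apache.sling:name=Event processing pool,service=ThreadPool,type=threads, and console_path turns that name back into the same path. Prefix the result with your own host and port to get the full URL.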

HTH.