Fluctuating Performance after upgrade from AEM 6.1 to 6.2

Rob_Curran__T4G
Level 2

29-06-2016

We have an environment that was recently upgraded from AEM 6.1 to 6.2.

Intermittently, for periods of 15 minutes to an hour, the server becomes nearly unresponsive. During these periods there are no errors in the logs.

However, I've noticed warnings about idle sessions appearing every 20-30 seconds, which seems excessive. Is this related? Any other thoughts?

28.06.2016 15:33:40.203 *INFO* [10.240.1.3 [1467141961252] GET /mnt/overlay/granite/ui/content/shell/header/actions/pulse.data.json HTTP/1.1] com.adobe.granite.repository Service [22484, [org.apache.jackrabbit.oak.api.jmx.SessionMBean]] ServiceEvent UNREGISTERING
28.06.2016 15:33:40.204 *WARN* [10.240.1.3 [1467141961252] GET /mnt/overlay/granite/ui/content/shell/header/actions/pulse.data.json HTTP/1.1] org.apache.jackrabbit.oak.jcr.session.RefreshStrategy This session has been idle for 7 minutes and might be out of date. Consider using a fresh session or explicitly refresh the session.
java.lang.Exception: The session was created here:
    at org.apache.jackrabbit.oak.jcr.session.RefreshStrategy$LogOnce.<init>(RefreshStrategy.java:169)
    at org.apache.jackrabbit.oak.jcr.repository.RepositoryImpl.login(RepositoryImpl.java:277)
    at com.adobe.granite.repository.impl.CRX3RepositoryImpl.login(CRX3RepositoryImpl.java:94)
    at org.apache.jackrabbit.oak.jcr.repository.RepositoryImpl.login(RepositoryImpl.java:219)
    at org.apache.sling.jcr.base.AbstractSlingRepository2.login(AbstractSlingRepository2.java:288)
    at org.apache.sling.jcr.resource.internal.helper.jcr.JcrProviderStateFactory.createProviderState(JcrProviderStateFactory.java:121)
    at org.apache.sling.jcr.resource.internal.helper.jcr.JcrResourceProvider.authenticate(JcrResourceProvider.java:267)
    at org.apache.sling.jcr.resource.internal.helper.jcr.JcrResourceProvider.authenticate(JcrResourceProvider.java:78)
    at org.apache.sling.resourceresolver.impl.providers.stateful.ProviderManager.authenticate(ProviderManager.java:161)
    at org.apache.sling.resourceresolver.impl.providers.stateful.ProviderManager.getOrCreateProvider(ProviderManager.java:87)
    at org.apache.sling.resourceresolver.impl.providers.stateful.ProviderManager.authenticateAll(ProviderManager.java:129)
    at org.apache.sling.resourceresolver.impl.ResourceResolverImpl.createControl(ResourceResolverImpl.java:154)
    at org.apache.sling.resourceresolver.impl.ResourceResolverImpl.<init>(ResourceResolverImpl.java:116)
    at org.apache.sling.resourceresolver.impl.ResourceResolverImpl.<init>(ResourceResolverImpl.java:110)
    at org.apache.sling.resourceresolver.impl.CommonResourceResolverFactoryImpl.getResourceResolverInternal(CommonResourceResolverFactoryImpl.java:257)
    at org.apache.sling.resourceresolver.impl.CommonResourceResolverFactoryImpl.getResourceResolver(CommonResourceResolverFactoryImpl.java:162)
    at org.apache.sling.resourceresolver.impl.ResourceResolverFactoryImpl.getResourceResolver(ResourceResolverFactoryImpl.java:99)
    at org.apache.sling.auth.core.impl.SlingAuthenticator.getResolver(SlingAuthenticator.java:782)
    at org.apache.sling.auth.core.impl.SlingAuthenticator.doHandleSecurity(SlingAuthenticator.java:497)
    at org.apache.sling.auth.core.impl.SlingAuthenticator.handleSecurity(SlingAuthenticator.java:451)
    at org.apache.sling.engine.impl.SlingHttpContext.handleSecurity(SlingHttpContext.java:121)
    at org.apache.felix.http.base.internal.service.ServletContextImpl.handleSecurity(ServletContextImpl.java:421)
    at org.apache.felix.http.base.internal.dispatch.InvocationChain.doFilter(InvocationChain.java:57)
    at org.apache.felix.http.base.internal.dispatch.Dispatcher.dispatch(Dispatcher.java:124)
    at org.apache.felix.http.base.internal.DispatcherServlet.service(DispatcherServlet.java:61)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:725)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:812)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:587)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
    at org.eclipse.jetty.server.Server.handle(Server.java:499)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:311)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
    at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
    at java.lang.Thread.run(Thread.java:745)
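
One way to check whether these warnings line up with the slow periods is to count them per minute and compare the counts with the times the server was unresponsive. The following is only a rough sketch: it assumes the timestamp layout seen in the entries above (dd.MM.yyyy HH:mm:ss.SSS), and the function name `idle_warnings_per_minute` is made up for illustration.

```python
import re
from collections import Counter

# Captures "dd.MM.yyyy HH:mm" from a WARN entry written by RefreshStrategy.
# search() is used so it also works when several entries share one line.
IDLE_WARNING = re.compile(
    r"(\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}):\d{2}\.\d{3} \*WARN\* "
    r".*RefreshStrategy This session has been idle"
)


def idle_warnings_per_minute(lines):
    """Return a Counter mapping 'dd.MM.yyyy HH:mm' to idle-session warning count."""
    counts = Counter()
    for line in lines:
        match = IDLE_WARNING.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feeding it the lines of the error log (for example `idle_warnings_per_minute(open(path))`, where `path` is wherever your instance writes its log) gives a per-minute count that can be compared against OS-level monitoring for the same window.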

Replies

gregoryp3536631
Level 2

19-07-2016

Hi,

Did you find a solution?

It seems I am facing the same problem... 🙂

bobkranson
Level 3

22-07-2016

This sounds like backup tasks running. Have you checked the maintenance schedule configuration? Which platform are you on, *nix or Windows? Have you monitored the server while this is happening? Is it CPU saturation or network saturation?

Hope this helps...

Opkar_Gill
Employee

22-07-2016

Hi,

AEM 6.2 automatically takes thread dumps and writes them to "crx-quickstart/threaddumps". You should analyse these, or take thread dumps yourself when the system starts slowing down.

Thread dumps should help you understand what is going on in your system, and from there you can investigate whether the issue lies in OOTB or custom code.
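
As a starting point for that analysis, a small script can tally thread states and the most common top-of-stack frames in a dump. This is a minimal sketch, assuming the dump uses the usual HotSpot jstack text layout (a `java.lang.Thread.State:` line followed by `at ...` frames); the function name `summarize_dump` is hypothetical and the parsing is deliberately simple.

```python
import re
from collections import Counter

# "   java.lang.Thread.State: WAITING (parking)" -> "WAITING"
STATE = re.compile(r"^\s*java\.lang\.Thread\.State: (\w+)")
# "\tat com.example.Foo.bar(Foo.java:10)" -> "com.example.Foo.bar"
FRAME = re.compile(r"^\s*at ([\w.$<>]+)\(")


def summarize_dump(text):
    """Return (states, top_frames) Counters for one thread dump."""
    states = Counter()
    top_frames = Counter()
    expect_frame = False  # True right after a state line, until a frame is seen
    for line in text.splitlines():
        state = STATE.match(line)
        if state:
            states[state.group(1)] += 1
            expect_frame = True
            continue
        frame = FRAME.match(line)
        if frame and expect_frame:
            # Only the first frame after the state line is the top of the stack.
            top_frames[frame.group(1)] += 1
            expect_frame = False
        elif not line.strip():
            # A blank line ends the current thread's section.
            expect_frame = False
    return states, top_frames
```

If many threads share the same top frame across several dumps taken during a slow period, that frame is a reasonable place to start looking, whether it turns out to be OOTB or custom code.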

Have you attached a profiler to the instance to see how resources are being consumed? You should do this monitoring at the OS level as well.

Regards,

Opkar

[0]https://docs.adobe.com/docs/en/aem/6-2/administer/operations/troubleshoot.html

gregoryp3536631
Level 2

25-07-2016

Hi,

 

Thanks for the help.

The issue came from around 500 workflows that had been started and crashed on the first step, so they kept restarting over and over again.

It seems that the logging module was failing too.

 

Fixed!

 

Regards

Grégory

pratikt68412688
Level 1

01-03-2017

Can you please explain how this issue was fixed for you and what steps you took? It seems our environment is facing the same issue. Thanks in advance.

krishc76025392
Level 2

11-10-2017

We have the same issue too. How can it be resolved?

bobkranson
Level 3

17-10-2017

I would suggest you look at the JMX event log and see what is running. This thread says the problem was solved by fixing the workflows that were crashing on the first step. Those workflow activities are generally logged as events. If you do some due diligence on your event logs, you might find the cause of your slowdowns.

system/console/jmx/org.apache.jackrabbit.oak%3Aid%3D1%2Cname%3D"Consolidated+Event+Listener+statistics"%2Ctype%3D"ConsolidatedListenerStats"

system/console/jmx/org.apache.sling%3Aname%3DEvent+processing+pool%2Cservice%3DThreadPool%2Ctype%3Dthreads
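
If you would rather look these MBeans up in a JMX client such as JConsole than in the web console, the paths above can be decoded into plain object names. A small sketch, assuming the console URL-encodes the name with `+` standing for a space; the helper name `jmx_object_name` is made up here.

```python
from urllib.parse import unquote_plus

CONSOLE_PREFIX = "system/console/jmx/"


def jmx_object_name(console_path):
    """Decode a web console JMX path into the MBean object name it points at."""
    # Keep only what follows the console prefix, then undo the URL encoding
    # (%3A -> ':', %3D -> '=', %2C -> ',', '+' -> ' ').
    encoded = console_path.split(CONSOLE_PREFIX, 1)[-1]
    return unquote_plus(encoded)
```

For the second path above this yields `org.apache.sling:name=Event processing pool,service=ThreadPool,type=threads`.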

HTH.