Expand my Community achievements bar.

Shared session usage in AEM 6.3 (oh no, not again!)

Avatar

Level 2

For our client we are developing a specific health check which analyses a page output traversing the tree of child pages.

I am aware that a shared session usage is a well known anti-pattern (https://cqdump.wordpress.com/2015/03/02/aem-scaling-patterns-avoid-shared-sessions/ and https://github.com/joerghoh/cqdump/blob/master/examples/bundle/src/main/java/de/joerghoh/cq5/example...) in AEM and keeping this in mind I created this code:

Assuming that I have a list of nodes to check:

for (Node node : nodes) {

  Future f = completionService.submit(testResultFunction(node.getPath()));
  futures.add(f);
}

...

private Callable<String> testResultFunction(final String nodePath) {

   return () -> Templates.serviceResourceResolverFunction(Constants.ServiceUsers.READ_SERVICE,
   resourceResolverFactory, resourceResolver -> {

     String result = requestService.doGet(checkedPageUrl, resourceResolver).getContent();
     ...

     return result;
   });
}

Templates.serviceResourceResolverFunction() looks like this:

public static <R> R serviceResourceResolverFunction(String serviceName, @Nonnull ResourceResolverFactory resourceResolverFactory,
   @Nonnull Function<ResourceResolver, R> function) {

  Map<String, Object> map = Collections.singletonMap(ResourceResolverFactory.SUBSERVICE, serviceName);
  ResourceResolver resolver = null;
 

  try {

   resolver = resourceResolverFactory.getServiceResourceResolver(map);
     return function.apply(resolver);
  } catch (LoginException e) {

    LOG.error("Access denied", e);
    return null;
  } finally {

    if (resolver != null && resolver.isLive()) {

     resolver.close();
    }

  }

}

I am receiving an output which forces me to scratch my head over and over again:

01.03.2018 15:51:00.880 *WARN* [ForkJoinPool-2-worker-1] org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate Attempted to perform hasProperty while thread ForkJoinPool-2-worker-2 was concurrently writing to this session. Blocked until the other thread finished using this session. Please review your code to avoid concurrent use of a session. ...and so on...

In debugger i see that I receive  a new resourceResolver per every request made, and I am not writing anything to the session (?!), so I can only wonder what's the problem here.

Any help will be much appreciated!

Edited 09.03.2018 by Evgeny Tugarev

19 Replies

Avatar

Administrator

Jörg Hoh​ can you help this memebr?



Kautuk Sahni

Avatar

Employee Advisor

It is indeed not obvious from this code, what code is actually writing to these sessions. Can you enable write logging /[1]) and post the results? From the details what is written it should be possible to deduce the location of the code.

[1] What is writing to my Oak repository? | Things on a content management system

Jörg

Avatar

Level 2

Thanks for looking onto this issue. Here's the debug info:

error.log:

09.04.2018 12:00:09.818 *WARN* [ForkJoinPool-1-worker-5] org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate Attempted to perform hasProperty while thread ForkJoinPool-1-worker-0 was concurrently writing to this session. Blocked until the other thread finished using this session. Please review your code to avoid concurrent use of a session.

trace.log:

09.04.2018 12:00:09.818 *TRACE* [ForkJoinPool-1-worker-0] org.apache.jackrabbit.oak.jcr.operations.writes [session-2713] refresh

09.04.2018 12:00:09.818 *TRACE* [ForkJoinPool-1-worker-0] org.apache.jackrabbit.oak.jcr.operations.writes [session-96801] refresh

09.04.2018 12:00:09.818 *TRACE* [ForkJoinPool-1-worker-0] org.apache.jackrabbit.oak.jcr.operations.writes [session-96802] refresh

09.04.2018 12:00:09.818 *TRACE* [ForkJoinPool-1-worker-5] org.apache.jackrabbit.oak.jcr.operations.writes [session-2713] refresh

09.04.2018 12:00:09.818 *TRACE* [ForkJoinPool-1-worker-5] org.apache.jackrabbit.oak.jcr.operations.writes [session-96801] refresh

09.04.2018 12:00:09.818 *TRACE* [ForkJoinPool-1-worker-5] org.apache.jackrabbit.oak.jcr.operations.writes [session-96802] refresh

09.04.2018 12:00:09.818 *TRACE* [ForkJoinPool-1-worker-3] org.apache.jackrabbit.oak.jcr.operations.writes [session-2713] refresh

09.04.2018 12:00:09.818 *TRACE* [ForkJoinPool-1-worker-3] org.apache.jackrabbit.oak.jcr.operations.writes [session-96801] refresh

09.04.2018 12:00:09.819 *TRACE* [ForkJoinPool-1-worker-3] org.apache.jackrabbit.oak.jcr.operations.writes [session-96802] refresh

Now this is a total mystery, a thread is refreshing several (?!) sessions... also, I expected that every thread to have a unique session after calling to resourceResolverFactory.getServiceResourceResolver().

Avatar

Employee Advisor

refresh() is considered as a write operation, although it does not write anything to the repo. I also did not find a refresh() call to the JCR session (which is more a less a 1:1 translation of a resourceresolver.refresh() call).

But anyways, this only should yield in an warn message if a session is shared between multiple threads.

But:

09.04.2018 12:00:09.818 *TRACE* [ForkJoinPool-1-worker-5] org.apache.jackrabbit.oak.jcr.operations.writes [session-96801] refresh  09.04.2018 12:00:09.818 *TRACE* [ForkJoinPool-1-worker-5] org.apache.jackrabbit.oak.jcr.operations.writes [session-96802] refresh  09.04.2018 12:00:09.818 *TRACE* [ForkJoinPool-1-worker-3] org.apache.jackrabbit.oak.jcr.operations.writes [session-2713] refresh  09.04.2018 12:00:09.818 *TRACE* [ForkJoinPool-1-worker-3] org.apache.jackrabbit.oak.jcr.operations.writes [session-96801] refresh

If you check the session names, you can see that there are sessions which are used by 2 threads (for example "session-96801" is used by worker-5 and worker-3. That means, that sessions are actually reused.

As you have written, that you get a new resolver inside every callable, there must be some kind of mapping issue between resource resolver and JCR session. I would expect that a new session is created for every resource resolver, but it looks like that there is some issue.

Can you instrument your code in a way, that you log:

* thread name

* session name (session.toString()

* resource resolver name (resolver.toString)

whenever you got this resolver? And btw can you apply SP2 to your AEM instance as well, just to eliminate any known issues which were present with the GA version?

Avatar

Level 2

2 Forum Admins: Please do not mark answer as correct before having a confirmation from the post author (me).

Avatar

Level 2

Hi Scott, yes I did, I am preparing a next round of debug info

Avatar

Level 2

Hi Jörg, as you advised, I installed SP2 (now I am running on 6.3.2.0) and added extended log info -> session.toString(), etc..

I can see that every thread is indeed using a separate session, but error messages still appear. Here's the log info:

error.log

11.04.2018 12:21:47.220 *WARN* [ForkJoinPool-5-worker-3] org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate Attempted to perform getItemOrNull while thread ForkJoinPool-5-worker-5 was concurrently writing to this session. Blocked until the other thread finished using this session. Please review your code to avoid concurrent use of a session.

11.04.2018 12:21:47.221 *WARN* [ForkJoinPool-5-worker-5] org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate Attempted to perform getItemOrNull while thread ForkJoinPool-5-worker-7 was concurrently writing to this session. Blocked until the other thread finished using this session. Please review your code to avoid concurrent use of a session.

11.04.2018 12:21:47.222 *WARN* [ForkJoinPool-5-worker-7] org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate Attempted to perform hasNext() while thread ForkJoinPool-5-worker-0 was concurrently writing to this session. Blocked until the other thread finished using this session. Please review your code to avoid concurrent use of a session.

11.04.2018 12:21:47.222 *WARN* [ForkJoinPool-5-worker-3] org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate Attempted to perform getItemOrNull while thread ForkJoinPool-5-worker-2 was concurrently writing to this session. Blocked until the other thread finished using this session. Please review your code to avoid concurrent use of a session.

trace.log

here I got the resolver:

11.04.2018 12:21:47.215 *TRACE* [ForkJoinPool-5-worker-2] acme.core.util.Templates Session: com.adobe.granite.repository.impl.CRX3SessionImpl@e44c16b, Resolver: org.apache.sling.resourceresolver.impl.ResourceResolverImpl@742db557, Thread ForkJoinPool-5-worker-2

11.04.2018 12:21:47.215 *TRACE* [ForkJoinPool-5-worker-7] acme.core.util.Templates Session: com.adobe.granite.repository.impl.CRX3SessionImpl@6209647f, Resolver: org.apache.sling.resourceresolver.impl.ResourceResolverImpl@5a6db36b, Thread ForkJoinPool-5-worker-7

11.04.2018 12:21:47.215 *TRACE* [ForkJoinPool-5-worker-0] acme.core.util.Templates Session: com.adobe.granite.repository.impl.CRX3SessionImpl@a149997, Resolver: org.apache.sling.resourceresolver.impl.ResourceResolverImpl@2312ad16, Thread ForkJoinPool-5-worker-0

11.04.2018 12:21:47.215 *TRACE* [ForkJoinPool-5-worker-3] acme.core.util.Templates Session: com.adobe.granite.repository.impl.CRX3SessionImpl@3034c0c0, Resolver: org.apache.sling.resourceresolver.impl.ResourceResolverImpl@21627c07, Thread ForkJoinPool-5-worker-3

11.04.2018 12:21:47.216 *TRACE* [ForkJoinPool-5-worker-4] acme.core.util.Templates Session: com.adobe.granite.repository.impl.CRX3SessionImpl@3b51daef, Resolver: org.apache.sling.resourceresolver.impl.ResourceResolverImpl@4f2319f8, Thread ForkJoinPool-5-worker-4

here I reuse the resolver:

11.04.2018 12:21:47.217 *TRACE* [ForkJoinPool-5-worker-3] acme.core.services.hc.PagesHealthCheck Using resource resolver: org.apache.sling.resourceresolver.impl.ResourceResolverImpl@21627c07

11.04.2018 12:21:47.217 *TRACE* [ForkJoinPool-5-worker-2] acme.core.services.hc.PagesHealthCheck Using resource resolver: org.apache.sling.resourceresolver.impl.ResourceResolverImpl@742db557

11.04.2018 12:21:47.217 *TRACE* [ForkJoinPool-5-worker-5] acme.core.services.hc.PagesHealthCheck Using resource resolver: org.apache.sling.resourceresolver.impl.ResourceResolverImpl@c2a0c2

11.04.2018 12:21:47.217 *TRACE* [ForkJoinPool-5-worker-6] acme.core.services.hc.PagesHealthCheck Using resource resolver: org.apache.sling.resourceresolver.impl.ResourceResolverImpl@6bfa6de0

11.04.2018 12:21:47.217 *TRACE* [ForkJoinPool-5-worker-7] acme.core.services.hc.PagesHealthCheck Using resource resolver: org.apache.sling.resourceresolver.impl.ResourceResolverImpl@5a6db36b

11.04.2018 12:21:47.217 *TRACE* [ForkJoinPool-5-worker-1] acme.core.services.hc.PagesHealthCheck Using resource resolver: org.apache.sling.resourceresolver.impl.ResourceResolverImpl@7ab52215

11.04.2018 12:21:47.217 *TRACE* [ForkJoinPool-5-worker-4] acme.core.services.hc.PagesHealthCheck Using resource resolver: org.apache.sling.resourceresolver.impl.ResourceResolverImpl@4f2319f8

11.04.2018 12:21:47.217 *TRACE* [ForkJoinPool-5-worker-0] acme.core.services.hc.PagesHealthCheck Using resource resolver: org.apache.sling.resourceresolver.impl.ResourceResolverImpl@2312ad16

and here are extra refresh(es) which cause a concurrency issue:

11.04.2018 12:21:47.219 *TRACE* [ForkJoinPool-5-worker-3] org.apache.jackrabbit.oak.jcr.operations.writes [session-2468] refresh

11.04.2018 12:21:47.220 *TRACE* [ForkJoinPool-5-worker-3] org.apache.jackrabbit.oak.jcr.operations.writes [session-31742] refresh

11.04.2018 12:21:47.220 *TRACE* [ForkJoinPool-5-worker-3] org.apache.jackrabbit.oak.jcr.operations.writes [session-31743] refresh

11.04.2018 12:21:47.220 *TRACE* [ForkJoinPool-5-worker-3] org.apache.jackrabbit.oak.jcr.operations.writes [session-31744] refresh

11.04.2018 12:21:47.220 *TRACE* [ForkJoinPool-5-worker-5] org.apache.jackrabbit.oak.jcr.operations.writes [session-2468] refresh

11.04.2018 12:21:47.220 *TRACE* [ForkJoinPool-5-worker-5] org.apache.jackrabbit.oak.jcr.operations.writes [session-31742] refresh

11.04.2018 12:21:47.220 *TRACE* [ForkJoinPool-5-worker-5] org.apache.jackrabbit.oak.jcr.operations.writes [session-31743] refresh

11.04.2018 12:21:47.220 *TRACE* [ForkJoinPool-5-worker-5] org.apache.jackrabbit.oak.jcr.operations.writes [session-31744] refresh

11.04.2018 12:21:47.220 *TRACE* [ForkJoinPool-5-worker-7] org.apache.jackrabbit.oak.jcr.operations.writes [session-2468] refresh

11.04.2018 12:21:47.220 *TRACE* [ForkJoinPool-5-worker-7] org.apache.jackrabbit.oak.jcr.operations.writes [session-31742] refresh

11.04.2018 12:21:47.220 *TRACE* [ForkJoinPool-5-worker-7] org.apache.jackrabbit.oak.jcr.operations.writes [session-31743] refresh

11.04.2018 12:21:47.221 *TRACE* [ForkJoinPool-5-worker-7] org.apache.jackrabbit.oak.jcr.operations.writes [session-31744] refresh

I still have no clue where all these extra sessions are coming from. I will continue debug the problem, ideas or suggestions are very much welcome!

Avatar

Level 10

In your code - how many sessions are you opening? Looks like you are opening a session for each node.

Avatar

Level 2

Yes, for each node (nodePath) there is a test function (testResultFunction) which goes to executor.

Avatar

Level 10

I have never attempted to loop through many nodes and opening a session for each node. Therefore i am not sure if these log messages are normal. I recommend following Joerg's suggestion of placing a bundle on Google drive so community can test.

Avatar

Level 2

Thank you both for a quick feedback. I will start preparing a bundle and will update the thread as soon as I am done (in a next few days).

Avatar

Level 10

This thread brings up a great question - can a lot of JCR Sessions be opening without these messages?

Avatar

Level 2

Indeed, this would be interesting to know.

UPD: I created a sample project and put it in git: GitHub - etugarev/acme-hc: Sling health check which produces blocking threads

Running project should be pretty straightforward, otherwise, if you experience any problems, please let me know.

Regards,

Evgeny

Avatar

Employee Advisor

Hi Evgeny,

I can reproduce it here locally. Thanks for the work you put into this!

My best assumption is [SLING-7542] Avoid shared session/resource resolver usage - ASF JIRA  , which is not yet released. But you can try to build GitHub - apache/sling-org-apache-sling-i18n: Apache Sling I18N Support  and deploy it locally on your instance (my gut feeling says that it this version should work on AEM 6.3.2.0 as well). Can you try that?

regards,

Jörg

Avatar

Level 2

Hi Jörg,

I am happy the demo project worked out for you   I will try the approach you suggested tomorrow and update the thread with results.

Thanks again for a prompt feedback!

UPD: I grabbed apache-sling-i18n version 2.5.15-SNAPSHOT from sling repository and installed compiled bundle via console.

But it looks that the problem still there..

Regards,

Evgeny

Avatar

Level 2

2 Forum Admins: Please do not mark answer as correct or "assumed answered" before having a confirmation from the post author (me)

Avatar

Employee Advisor

Thanks for the feedback. In that case please raise a Daycare ticket for help and attach your sample code. Being able to reproduce it with a test will definitly help to speedup the problem resolution.

Thanks,

Jörg

Avatar

Employee Advisor

Can you extract that code into a custom bundle, so I can try that on my own system? In the meanwhile I will look at the logs your provided.

Jörg