SOLVED

Author and Publish instance becomes unresponsive after deployment

bhawesh-dandona
Level 2

Hi,

 

We are facing the following two unusual issues in our AMS-managed stage environments:

  • During a deployment, either the author or the publish instance becomes unresponsive, and restarting the service is the only way to bring it back up.
  • Deployments take a long time, 30-40 minutes.

After analyzing the logs, we found that AEM restarts many of its services during deployment, and sometimes restarts them multiple times. This is one of the reasons for the long deployment times we are seeing.

 

Our deployment package is 13 MB. When I installed it on a fresh instance, it took just 5 minutes and I didn't notice any services restarting.

 

The exception we most often see repeated in the logs when the servers become unresponsive is "AuthenticationSupport service missing":

org.apache.sling.engine.impl.SlingHttpContext handleSecurity: AuthenticationSupport service missing. Cannot authenticate request.
org.apache.sling.engine.impl.SlingHttpContext handleSecurity: Possible reason is missing Repository service. Check AuthenticationSupport dependencies.
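
To see how long an instance stays in this state during a deployment, one option is to count the message in the log and note its first and last occurrence. The sketch below is only illustrative: the function name and the sample lines are made up, the date/time prefix is an assumption about the log layout, and only the message text itself comes from the excerpt above.

```python
from typing import Iterable, Optional, Tuple

# Message text taken from the log excerpt above
MARKER = "AuthenticationSupport service missing"

def summarize_auth_failures(lines: Iterable[str]) -> Tuple[int, Optional[str], Optional[str]]:
    """Return (count, first_stamp, last_stamp) for lines containing MARKER.

    The stamp is the first two whitespace-separated tokens of a line,
    assumed (not verified) to be the date and time.
    """
    count = 0
    first = last = None
    for line in lines:
        if MARKER not in line:
            continue
        count += 1
        stamp = " ".join(line.split()[:2])
        if first is None:
            first = stamp
        last = stamp
    return count, first, last

# Hypothetical sample lines, not copied from a real log
sample = [
    "01.01.2020 10:00:01.000 *ERROR* [qtp-1] org.apache.sling.engine.impl.SlingHttpContext "
    "handleSecurity: AuthenticationSupport service missing. Cannot authenticate request.",
    "01.01.2020 10:00:02.000 *INFO* [main] some.other.Logger unrelated message",
    "01.01.2020 10:05:30.000 *ERROR* [qtp-2] org.apache.sling.engine.impl.SlingHttpContext "
    "handleSecurity: AuthenticationSupport service missing. Cannot authenticate request.",
]
print(summarize_auth_failures(sample))
```

In practice the lines would come from iterating over the instance's log file rather than from a list.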

 

We are working with AMS on this and a ticket is already opened with them.

 

AMS suggested a few changes and we applied them. For example, we updated the "Apache Sling Job Manager" configuration from 30 (the default) to 300 so that there is a sufficient delay before jobs start. This was done because Sling jobs were starting before the Sling event bundle was active, which cleared everything under "/var/eventing/jobs".

 

They are still investigating, but I wanted to share the details here as well to find out whether anybody else has faced a similar issue and how they resolved it.

 

Any input here would be really helpful.

 

Thanks,

Bhawesh

1 Accepted Solution
SundeepKatepally
Correct answer by
Level 5

It's a known issue, and Adobe is looking for a solution. Below is the reason why the instance becomes unresponsive:

The deployment tries to restart all the bundles that the deployed custom bundle (the 13 MB one) refers to. Let's say the custom bundle refers to 20 OOTB bundles. That doesn't mean it restarts only 21 bundles; it might restart around 400-500 bundles, because those 20 bundles in turn depend on other bundles (a chain of dependencies).

 

Altogether, deploying the custom bundle is equivalent to restarting the instance itself. In fact, we observed that restarting the instance is faster than waiting for the bundles to restart.
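
The chain-of-dependencies effect can be illustrated with a small graph walk. This is a rough sketch, not how the OSGi framework actually resolves or refreshes bundles: the bundle names and the `refers_to` map are hypothetical, and the function simply follows "refers to" edges transitively, as described above.

```python
from collections import deque
from typing import Dict, List, Set

def affected_bundles(start: str, refers_to: Dict[str, List[str]]) -> Set[str]:
    """Breadth-first walk: start plus every bundle reachable via refers_to."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for dep in refers_to.get(current, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen

# Hypothetical dependency graph: bundle -> bundles it refers to
refers_to = {
    "custom": ["a", "b"],
    "a": ["c", "d"],
    "b": ["d", "e"],
    "e": ["f"],
    "unrelated": ["a"],  # never reached from "custom"
}

direct = len(refers_to["custom"]) + 1  # custom bundle plus its direct references
total = len(affected_bundles("custom", refers_to))
print(f"direct: {direct}, transitive: {total}")
```

Even in this toy graph the transitive count is more than double the direct count, and on a real instance with hundreds of bundles the gap can be far larger.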


8 Replies
bhawesh-dandona
Level 2

Thanks @SundeepKatepally for your response. Did you also find that it sometimes results in the server becoming unresponsive?

 

I have worked with version 6.4.2 on two other projects but did not face this issue there. Does it happen only in a specific scenario? Moreover, our author takes around 10-15 minutes to come back, but the publisher usually takes 30-40 minutes.

SundeepKatepally
Level 5

@bhawesh-dandona - It's not sometimes; it becomes unresponsive every time. The time it takes to come back depends purely on the server hardware.

bhawesh-dandona
Level 2

Thanks @SundeepKatepally for your valuable inputs. It helps.

 

We also have a ticket raised with AMS to help us investigate the issue. It's been a few weeks now, but they have not told us about this product issue yet. It is now business critical for us, as we are not sure what other areas are impacted, if any.

 

It seems we have to live with it until they provide a fix.

bhawesh-dandona
Level 2

@SundeepKatepally Another interesting thing to note is that on a fresh instance it takes only 5-7 minutes to complete the deployment.

 

If it's a consistent product issue, it should happen on a fresh instance as well. I suspect there are more conditions involved in when this occurs, maybe when you have a lot of content/assets in the repository.

SundeepKatepally
Level 5

@bhawesh-dandona - The pattern is that it depends on how many Java packages are updated in the custom bundle. When I reinstall the same bundle, it doesn't take much time, whereas installing a bundle with many Java package changes takes more time.