AEM fails to start under heavy load

Level 1

Hi Guys,

I have an AEM publish instance running on an AWS EC2 instance. A monit script starts and stops the AEM service.

It had been working fine. However, on one occasion under heavy load on the publish instance, the AEM service failed to start when a new EC2 instance was spun up as part of autoscaling.

Looking forward to your suggestions.

----------

check process aem with pidfile /srv/sfw/aem/crx-quickstart/conf/cq.pid
       start program = "/sbin/service aem start"
       stop program = "/sbin/service aem stop"
       if failed port 8080 then restart

-----------

The aem script

--------------------

CQ5_ROOT=/< >
CQ5_USER=< >
########
SERVER=${   }/crx-quickstart
START=${SERVER}/bin/start
STOP=${SERVER}/bin/stop
STATUS="${SERVER}/bin/status"

case "$1" in
start)
    echo -n "Starting aem services: "
    su - ${CQ5_USER} ${START}
    touch /var/lock/subsys/aem
    ;;
stop)
    echo -n "Shutting down aem services: "
    su - ${CQ5_USER} ${STOP}
    sleep 20
    rm -f /var/lock/subsys/aem
    ;;
status)
    su - ${CQ5_USER} ${STATUS}
    ;;
restart)
    su - ${CQ5_USER} ${STOP}
    su - ${CQ5_USER} ${START}
    ;;
reload)
    ;;
*)
    echo "Usage: aem {start|stop|status|reload}"
    exit 1
    ;;
esac

2 Replies

Employee Advisor

How was it failing? Do you have log data? Unfortunately, the data you provided does not give any indication of what went wrong.

Jörg

Level 1

Hi Jorg,

Thanks for your reply. Monit was unable to start the AEM service.

This is the monit log I have:

[AEST Sep  1 13:38:48] info     : monit: generated unique Monit id 815c50e52d068454185da5c072e65b81 and stored to '/root/.monit.id'
[AEST Sep  1 13:38:48] info     : Starting monit HTTP server at [localhost:2812]
[AEST Sep  1 13:38:48] info     : monit HTTP server started
[AEST Sep  1 13:38:48] info     : 'system_aem-publish-prd-blue-113' Monit started
[AEST Sep  1 13:38:48] error    : 'aem' failed, cannot open a connection to INET[localhost:8080] via TCP
[AEST Sep  1 13:38:48] info     : 'aem' trying to restart
[AEST Sep  1 13:38:48] info     : 'aem' stop: /sbin/service
[AEST Sep  1 13:38:52] info     : 'aem' start: /sbin/service
[AEST Sep  1 13:40:53] info     : 'aem' connection succeeded to INET[localhost:8080] via TCP
[AEST Sep  1 13:50:45] info     : stop service 'aem' on user request
[AEST Sep  1 13:50:45] info     : monit daemon at 3211 awakened
[AEST Sep  1 13:50:45] info     : Awakened by User defined signal 1
[AEST Sep  1 13:50:45] info     : 'aem' stop: /sbin/service
[AEST Sep  1 13:51:15] error    : 'aem' failed to stop
[AEST Sep  1 13:51:15] info     : 'aem' stop action done
[AEST Sep  1 13:54:24] info     : start service 'aem' on user request
[AEST Sep  1 13:54:24] info     : monit daemon at 3211 awakened
[AEST Sep  1 13:54:24] info     : Awakened by User defined signal 1
[AEST Sep  1 13:54:24] info     : 'aem' start: /sbin/service
[AEST Sep  1 13:54:25] info     : 'aem' started
[AEST Sep  1 13:54:25] info     : 'aem' start action done
[AEDT Oct 10 13:55:55] error    : 'aem' process is not running
[AEDT Oct 10 13:55:55] info     : 'aem' trying to restart
[AEDT Oct 10 13:55:55] info     : 'aem' start: /sbin/service
[AEDT Oct 10 13:56:25] error    : 'aem' failed to start
[AEDT Oct 10 13:58:25] info     : 'aem' process is running with pid 25063

Level 1

Also, Java was under a lot of pressure, which caused the AEM service to stop, and the monit script failed to restart it. As a result, manual intervention was required to properly stop and start the AEM service.

Level 3

In my experience, AEM start and stop take a long time under high load, but the instance should eventually come up. Do you have any AEM logs for that period?
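The last restart attempt in the monit log above fits this picture: monit reports 'failed to start' 30 seconds after issuing the start command, and the process is then seen running two minutes later. Thirty seconds is monit's default wait for a start or stop program, so one possible adjustment is a longer `with timeout` on both programs. The fragment below reuses the check from the original post; the 300 and 120 second values are assumptions to tune, not measured figures.

----------

check process aem with pidfile /srv/sfw/aem/crx-quickstart/conf/cq.pid
       # Assumed values: give AEM more than the default 30 seconds
       start program = "/sbin/service aem start" with timeout 300 seconds
       stop program = "/sbin/service aem stop" with timeout 120 seconds
       if failed port 8080 then restart

-----------

Whether this alone would have avoided the manual intervention cannot be confirmed without the AEM logs.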

Level 1

Hi Chandu,

No, unfortunately I do not have any AEM logs. Moreover, monit was not starting AEM; I have pasted the relevant scripts above in the thread.

Employee Advisor

Based on the above log statements alone, it's impossible to guess what caused AEM not to start. Are there more logs of this process available (e.g. the output of the commands monit runs in the background)?

Jörg

Level 1

Hi Jorg,

The script which monit uses to start aem is pasted above. That's the only thing I see on the server.

Employee Advisor

Well, if there are no AEM logs available, nor the output of the start script itself (which might indicate JVM problems), I don't see how I can help you resolve this issue 😕

Jörg
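For a future occurrence, the start script's output could be kept rather than discarded. The following is a minimal sketch, assuming a POSIX shell; the helper name `run_logged` and any log path passed to it are hypothetical and not part of the original script.

```shell
#!/bin/sh
# run_logged LOGFILE COMMAND [ARGS...]   (hypothetical helper)
# Runs COMMAND, appends its stdout and stderr to LOGFILE together with
# a timestamped header and the exit status, then returns that status.
run_logged() {
    logfile=$1
    shift
    printf '[%s] running: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >>"$logfile"
    "$@" >>"$logfile" 2>&1
    status=$?
    printf '[%s] exit status: %d\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$status" >>"$logfile"
    return "$status"
}
```

In the `start)` branch this might be called as `run_logged /var/log/aem-init.log su - ${CQ5_USER} ${START}`, where `/var/log/aem-init.log` is an assumed location. Any JVM error printed by the start script would then be available afterwards.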

Level 1

Hi Jorg,

I understand that without logs it is hard to point to a solution. However, do you have any suggestions for modifying the startup script pasted above?
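One change that could be considered, sketched here under assumptions rather than as a confirmed fix: the `stop)` branch sleeps a fixed 20 seconds, which may be too short under load (the monit log shows 'failed to stop' once). Polling until the process has actually exited avoids guessing the duration. The helper name `wait_for_exit` and the time limit are hypothetical.

```shell
#!/bin/sh
# wait_for_exit PID TIMEOUT_SECONDS   (hypothetical helper)
# Polls once per second until process PID no longer exists.
# Returns 0 if it exited within the timeout, 1 otherwise.
wait_for_exit() {
    pid=$1
    remaining=$2
    # kill -0 sends no signal; it only tests whether the process exists
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}
```

In the `stop)` branch the PID would be read from the pidfile named in the monit check (`/srv/sfw/aem/crx-quickstart/conf/cq.pid`) before calling `${STOP}`, and `sleep 20` replaced with something like `wait_for_exit "$pid" 120`, with 120 seconds being an assumed limit.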

Community Advisor

Dear Sir or Madam,

Why not set up a health check for your Publisher, and only start sending it heavy traffic once it's fully ready to serve requests?

E.g. have it properly configured in Dispatcher or your preferred Load Balancer:

[1] Configuring Dispatcher
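On the instance itself, a readiness check of this kind can be sketched as a polling loop. This is only an illustration, assuming a POSIX shell; the helper name `wait_until_ready`, the probe command, and the timings are hypothetical, and it does not replace a health check configured in the Dispatcher or load balancer.

```shell
#!/bin/sh
# wait_until_ready TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# (hypothetical helper) Re-runs COMMAND every INTERVAL_SECONDS until it
# succeeds (returns 0) or roughly TIMEOUT_SECONDS have been spent
# sleeping between attempts (returns 1).
wait_until_ready() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    until "$@" >/dev/null 2>&1; do
        elapsed=$((elapsed + interval))
        if [ "$elapsed" -gt "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
    done
    return 0
}
```

For example, `wait_until_ready 600 10 curl -fs http://localhost:8080/` would probe the port used in the monit check; the path `/` is a placeholder, and a page that exercises the application would be a more meaningful probe.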

Alternatively, you could set up an additional bundle in AEM Publisher with a very high startup priority that would block ANY requests to non-system URLs until the instance is fully ready.

Regards,

Peter

Level 1

Hi Peter,

Thanks for your reply. Could you please elaborate on your second option?

Employee Advisor

Sorry, I don't feel comfortable suggesting any change without knowing or understanding the details of the problem.

Jörg