AEM fails to start under heavy load

liaquathk607427
Level 1

Hi Guys,

I have an AEM publish instance running on an AWS EC2 instance. A monit script starts and stops the AEM service.

It was working fine. However, on one occasion, due to heavy load on the publish instance, the AEM service failed to start when a new EC2 instance spun up as part of autoscaling.

Looking forward to your suggestions.

----------

check process aem with pidfile /srv/sfw/aem/crx-quickstart/conf/cq.pid
       start program = "/sbin/service aem start"
       stop program = "/sbin/service aem stop"
       if failed port 8080 then restart

-----------

The aem script

--------------------

CQ5_ROOT=/< >
CQ5_USER=< >

########

SERVER=${   }/crx-quickstart
START=${SERVER}/bin/start
STOP=${SERVER}/bin/stop
STATUS="${SERVER}/bin/status"

case "$1" in
start)
    echo -n "Starting aem services: "
    su - ${CQ5_USER} ${START}
    touch /var/lock/subsys/aem
    ;;
stop)
    echo -n "Shutting down aem services: "
    su - ${CQ5_USER} ${STOP}
    sleep 20
    rm -f /var/lock/subsys/aem
    ;;
status)
    su - ${CQ5_USER} ${STATUS}
    ;;
restart)
    su - ${CQ5_USER} ${STOP}
    su - ${CQ5_USER} ${START}
    ;;
reload)
    ;;
*)
    echo "Usage: aem {start|stop|status|restart|reload}"
    exit 1
    ;;
esac
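
One detail in the script above that may matter under load: `stop` relies on a fixed `sleep 20`, and `restart` calls start immediately after stop without waiting at all. If the JVM needs longer than that to shut down, the next start can collide with the old process. A minimal sketch of waiting on the PID file instead, assuming the pidfile path from the monit check is the one AEM writes; `wait_for_exit` and its 120-second default are hypothetical, not part of AEM or monit:

```shell
#!/bin/sh
# Hypothetical helper: wait until the process named in a PID file has exited.
# Usage: wait_for_exit PIDFILE [TIMEOUT_SECONDS]
# Returns 0 once the process is gone (or the PID file is missing or empty),
# 1 if the process is still alive after the timeout.
wait_for_exit() {
    pidfile=$1
    timeout=${2:-120}   # assumed default; tune to the observed shutdown time
    [ -s "$pidfile" ] || return 0
    pid=$(cat "$pidfile")
    waited=0
    # kill -0 sends no signal; it only checks whether the PID exists.
    while kill -0 "$pid" 2>/dev/null; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

In the `stop)` and `restart)` branches this could take the place of the fixed sleep, for example `wait_for_exit /srv/sfw/aem/crx-quickstart/conf/cq.pid 120 || echo "aem still running" >&2`.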

12 Replies
Jörg_Hoh
Employee

How was it failing? Do you have log data? The data you provided unfortunately does not give any indication what went wrong.

Jörg

liaquathk607427
Level 1

Hi Jorg,

Thanks for your reply. Monit was unable to start the AEM service.

This is the monit log I have:

[AEST Sep  1 13:38:48] info     : monit: generated unique Monit id 815c50e52d068454185da5c072e65b81 and stored to '/root/.monit.id'

[AEST Sep  1 13:38:48] info     : Starting monit HTTP server at [localhost:2812]

[AEST Sep  1 13:38:48] info     : monit HTTP server started

[AEST Sep  1 13:38:48] info     : 'system_aem-publish-prd-blue-113' Monit started

[AEST Sep  1 13:38:48] error    : 'aem' failed, cannot open a connection to INET[localhost:8080] via TCP

[AEST Sep  1 13:38:48] info     : 'aem' trying to restart

[AEST Sep  1 13:38:48] info     : 'aem' stop: /sbin/service

[AEST Sep  1 13:38:52] info     : 'aem' start: /sbin/service

[AEST Sep  1 13:40:53] info     : 'aem' connection succeeded to INET[localhost:8080] via TCP

[AEST Sep  1 13:50:45] info     : stop service 'aem' on user request

[AEST Sep  1 13:50:45] info     : monit daemon at 3211 awakened

[AEST Sep  1 13:50:45] info     : Awakened by User defined signal 1

[AEST Sep  1 13:50:45] info     : 'aem' stop: /sbin/service

[AEST Sep  1 13:51:15] error    : 'aem' failed to stop

[AEST Sep  1 13:51:15] info     : 'aem' stop action done

[AEST Sep  1 13:54:24] info     : start service 'aem' on user request

[AEST Sep  1 13:54:24] info     : monit daemon at 3211 awakened

[AEST Sep  1 13:54:24] info     : Awakened by User defined signal 1

[AEST Sep  1 13:54:24] info     : 'aem' start: /sbin/service

[AEST Sep  1 13:54:25] info     : 'aem' started

[AEST Sep  1 13:54:25] info     : 'aem' start action done

[AEDT Oct 10 13:55:55] error    : 'aem' process is not running

[AEDT Oct 10 13:55:55] info     : 'aem' trying to restart

[AEDT Oct 10 13:55:55] info     : 'aem' start: /sbin/service

[AEDT Oct 10 13:56:25] error    : 'aem' failed to start

[AEDT Oct 10 13:58:25] info     : 'aem' process is running with pid 25063
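
Reading the timestamps in the log above: on Sep 1 the start was issued at 13:38:52 and port 8080 only answered at 13:40:53, roughly two minutes later. On Oct 10 monit reported 'failed to start' exactly 30 seconds after the start command (13:55:55 to 13:56:25), yet the process was running by 13:58:25. The stop on Sep 1 shows the same 30-second gap (13:50:45 to 13:51:15). That pattern is consistent with monit's default 30-second program timeout being shorter than what AEM needs under load, although without AEM logs this remains a guess. A sketch of the same check with explicit timeouts; the 300 and 120 second values are assumptions to be tuned against real startup and shutdown times:

```
check process aem with pidfile /srv/sfw/aem/crx-quickstart/conf/cq.pid
       start program = "/sbin/service aem start" with timeout 300 seconds
       stop program = "/sbin/service aem stop" with timeout 120 seconds
       if failed port 8080 then restart
```

The port test is left unchanged here; only the start and stop timeouts differ from the original check.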

liaquathk607427
Level 1

Also, Java was under a lot of pressure, which caused the AEM service to stop, and monit failed to restart it. As a result, manual intervention was required to properly stop and start the AEM service.

chandu_t
Level 3

In my observation, AEM start and stop take a long time under high load, but the instance should eventually come up. Do you have any AEM logs for that period?

liaquathk607427
Level 1

Hi Chandu,

No, unfortunately I do not have any AEM logs. Moreover, monit was not starting AEM. The scripts involved are pasted above in the thread.

Jörg_Hoh
Employee

Based on the above log statements alone, it's impossible to guess what caused AEM not to start. Are there more logs of this process available (e.g. the output of the commands monit runs in the background)?

Jörg
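
To have that output available next time, the init script could capture what the start and stop commands print. A minimal sketch, assuming it is acceptable to append to a log file from the init script; `run_logged` and the log path in the usage note are hypothetical:

```shell
#!/bin/sh
# Hypothetical wrapper: run a command and append its stdout, stderr, and exit
# status to a log file, so there is something to inspect after a failed start.
# Usage: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
    logfile=$1
    shift
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] running: $*" >> "$logfile"
    "$@" >> "$logfile" 2>&1
    rc=$?
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] exit status: $rc" >> "$logfile"
    # Pass the command's exit status through to the caller.
    return "$rc"
}
```

In the `start)` branch of the script this might look like `run_logged /var/log/aem-init.log su - ${CQ5_USER} ${START}`, where `/var/log/aem-init.log` is only an example location.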

liaquathk607427
Level 1

Hi Jorg,

The script that monit uses to start AEM is pasted above. That's the only thing I see on the server.

Jörg_Hoh
Employee

Well, if neither AEM logs nor the output of the start script itself (which might indicate JVM problems) are available, I don't see how I can help you resolve this issue 😕

Jörg

liaquathk607427
Level 1

Hi Jorg,

I understand that without logs it is hard to point to a solution. However, do you have any suggestions for modifying the startup script pasted above?

Peter_Puzanovs
Community Advisor

Dear Sir or Madam,

Why not set up a health check for your publisher, and only start sending it heavy traffic once it's fully ready to serve your requests?

E.g. have it properly configured in Dispatcher or your preferred Load Balancer:

[1] Configuring Dispatcher

Alternatively, you could set up an additional bundle in the AEM publisher with a very high startup priority that would block ANY requests to the instance on non-system URLs until the instance is fully ready.

Regards,

Peter
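
To illustrate the first option: a health check boils down to probing a URL until it answers with HTTP 200, and only then letting traffic in. A load balancer or Dispatcher would normally do this itself; the sketch below only shows the idea in shell, for example at the end of an instance boot script. The function names are hypothetical, and the default URL is an assumption based on port 8080 from the monit check; a page that is only available once the instance is fully started would be a better target.

```shell
#!/bin/sh
# Hypothetical probe: succeeds only if the URL answers with HTTP 200.
# The default URL is an assumption, not a documented AEM health endpoint.
probe_http_200() {
    url=${1:-http://localhost:8080/}
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url")
    [ "$code" = "200" ]
}

# Hypothetical helper: retry a probe command until it succeeds.
# Usage: wait_until_ready ATTEMPTS DELAY_SECONDS PROBE_COMMAND [ARGS...]
# Returns 0 on the first successful probe, 1 if every attempt fails.
wait_until_ready() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        # Sleep between attempts, but not after the last one.
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    return 1
}
```

For example, `wait_until_ready 60 5 probe_http_200` would poll for up to about five minutes of sleep time plus probe time before giving up.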

liaquathk607427
Level 1

Hi Peter,

Thanks for your reply. Could you please elaborate on your second option?

Jörg_Hoh
Employee

Sorry, I don't feel comfortable suggesting any change without knowing or understanding the details of the problem.

Jörg