Issue in AEM Instance showing High CPU Usage

Hemalatha_Bodab
Level 1

23-09-2019

Hi All,

We are frequently experiencing high CPU usage on our AEM instances. Details are below.

1. Our production environment has 4 publish instances and 2 author instances.

2. AEM is configured on the app nodes.

3. CPU usage frequently hits 98% on one of the publish instances and stays there for nearly 10 hours.

During that time we see no load on the other publish instances.

4. We checked the Dispatcher configuration and logs as well; request processing looks normal and there are no errors in the error logs.

5. While analyzing the thread dumps for that period, we see the threads below in BLOCKED state.

Thread dump 6/10
"ApacheSlingdefault_QuartzSchedulerThread" prio=5 tid=0x298 nid=0xffffffff waiting for monitor entry
     java.lang.Thread.State: BLOCKED
     at java.lang.Object.wait(Native Method)
     - waiting to lock <0x39d83e0e> (a java.lang.Object) owned by "null" tid=0x-1
     at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:311)

Thread dump 9/10

"sling-default-28" prio=5 tid=0x104d nid=0xffffffff waiting for monitor entry
     java.lang.Thread.State: BLOCKED
     at sun.misc.Unsafe.park(Native Method)
     - waiting to lock <0x2a446ba5> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) owned by "null" tid=0x-1
     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
     at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
     at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)


Could anyone please suggest a solution for this issue?

Accepted Solutions (1)

Jörg_Hoh
Employee

25-09-2019

Hi,

Short answer: these threads are neither blocking anything nor blocked by anything. This is just how the implementation of the Object.wait() method shows up in a thread dump.

You can essentially ignore these two.

Answers (9)

aemmarc
Employee

24-09-2019

Be careful with the word "deadlock", as it means something very specific. I haven't seen a true case of deadlock in AEM in quite some time.

aemmarc
Employee

27-09-2019

This thread tells you nothing.

You're looking at the wrong things. Look for what's running.

java.lang.Thread.State: RUNNABLE

Jörg_Hoh
Employee

26-09-2019

This is a simple threadpool thread waiting for a job...

PuzanovsP
MVP

26-09-2019

Hi Hema,

Have you looked at /system/console/status-slingscheduler? Is any job running when the CPU is high?

Possibly there is a job that coincides with the high load.

Regards,

Peter

Hemalatha_Bodab
Level 1

26-09-2019

Yes. We can see numerous threads with 0x2a446ba5. Please find the snippet below and let me know your findings.

"sling-default-50" prio=5 tid=0x1109 nid=0xffffffff in Object.wait()
     java.lang.Thread.State: WAITING (on object monitor)
     at sun.misc.Unsafe.park(Native Method)
     - waiting to lock <0x2a446ba5> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) owned by "null" tid=0x-1
     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
     at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
     at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
     at java.lang.Thread.run(Thread.java:745)

SonDang
Employee

24-09-2019

Can you search the thread dumps to see which thread is holding the lock "0x2a446ba5"? That would tell you which thread is blocking the thread in question.
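One way to run that search is a small script rather than reading the dumps by eye. The following is only a sketch: `waiters_by_lock` is a hypothetical helper name, and the regular expressions assume the `waiting to lock <address> ... owned by "name"` line layout seen in the dumps posted in this thread (other dump formats would need different patterns).

```python
import re
from collections import defaultdict

# Thread header line, e.g. "sling-default-50" prio=5 tid=0x1109 ...
HEADER = re.compile(r'^"(?P<name>[^"]+)"')
# Lock line in the layout used by the dumps in this thread (an assumption
# about the format, not a general jstack parser).
WAITING = re.compile(
    r'waiting to lock <(?P<lock>0x[0-9a-fA-F]+)>.*owned by "(?P<owner>[^"]*)"'
)


def waiters_by_lock(dump_text):
    """Return {(lock address, owner name): [names of threads waiting on it]}."""
    waiters = defaultdict(list)
    current = None  # name of the thread whose stack we are currently reading
    for line in dump_text.splitlines():
        stripped = line.strip()
        header = HEADER.match(stripped)
        if header:
            current = header.group("name")
            continue
        waiting = WAITING.search(stripped)
        if waiting and current is not None:
            key = (waiting.group("lock"), waiting.group("owner"))
            waiters[key].append(current)
    return dict(waiters)
```

If every waiter on a lock reports the owner as "null", as in the snippets posted here, that would suggest no thread is actually holding it.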

aemmarc
Employee

24-09-2019

BLOCKED threads are symptoms of a problem, not the problem itself.

You need to look at the RUNNABLE threads over a series of thread dump captures to understand what's causing resource contention.

Also use a better way to capture the thread dumps (like the jstack series script @JaideepBrar suggested). Whenever you see the TID as 0xffffffff, it implies you used the same JVM that's buggered to capture the thread dumps. You need to launch jstack from a separate JVM than the one running AEM.
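As a rough illustration of looking at RUNNABLE threads across a series of captures, the sketch below counts how many dumps each thread was RUNNABLE in. The function names are hypothetical, and it assumes each thread starts with a quoted-name header line followed by a `java.lang.Thread.State:` line, as in the dumps posted in this thread.

```python
import re
from collections import Counter

# Thread header line, e.g. "sling-default-28" prio=5 tid=0x104d ...
HEADER = re.compile(r'^"(?P<name>[^"]+)"')
# State line, e.g. java.lang.Thread.State: WAITING (on object monitor)
STATE = re.compile(r'java\.lang\.Thread\.State:\s+(?P<state>[A-Z_]+)')


def threads_by_state(dump_text):
    """Return {thread name: state} for a single thread dump."""
    states = {}
    current = None  # thread whose state line we are still waiting for
    for line in dump_text.splitlines():
        header = HEADER.match(line.strip())
        if header:
            current = header.group("name")
            continue
        state = STATE.search(line)
        if state and current is not None:
            states[current] = state.group("state")
            current = None
    return states


def persistent_runnable(dumps):
    """Count, per thread name, in how many dumps it was RUNNABLE."""
    counts = Counter()
    for dump_text in dumps:
        for name, state in threads_by_state(dump_text).items():
            if state == "RUNNABLE":
                counts[name] += 1
    return counts
```

Calling `persistent_runnable(dumps).most_common(10)` on the texts of the captured dumps would list the threads that were RUNNABLE most often; their stack traces are the ones worth reading first.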

hamidk92094312
Employee

23-09-2019

Usually a deadlock could cause high CPU usage. See the following document for information on how to analyze slow and blocked processes: Analyze slow and blocked processes

jbrar
Employee

23-09-2019

When the CPU usage is high, take at least 10 thread dumps 3 seconds apart using the script at [1]. The script also captures top output, which can be used to map threads to CPU usage.

Then use fastthread [2] to analyze those dumps. Its "CPU Consuming Threads" section gives more detail on the threads consuming the most CPU.

[1] jstackSeries.sh/jstackSeries.sh at master · cqsupport/jstackSeries.sh · GitHub

[2] Smart Java thread dump analyzer - thread dump analysis in seconds
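The mapping between top output and a thread dump can also be done by hand: jstack prints each thread's native ID as a hexadecimal `nid`, while per-thread top output shows the same ID in decimal. The sketch below assumes you have already extracted a `{decimal thread ID: %CPU}` dictionary from the top output yourself; the function names are hypothetical and parsing of top output is deliberately left out.

```python
import re

# Thread header with a native ID, e.g. "worker-1" ... nid=0x1a2b runnable
NID = re.compile(r'^"(?P<name>[^"]+)".*\bnid=(?P<nid>0x[0-9a-fA-F]+)')


def names_by_native_id(dump_text):
    """Map decimal native thread ID -> thread name for one jstack dump."""
    mapping = {}
    for line in dump_text.splitlines():
        match = NID.match(line.strip())
        if match:
            # Convert the hex nid to the decimal form that top displays.
            mapping[int(match.group("nid"), 16)] = match.group("name")
    return mapping


def hottest_threads(cpu_by_tid, dump_text, limit=5):
    """Return [(thread name, %CPU)] for the busiest thread IDs.

    cpu_by_tid is an assumed input: {decimal thread ID: %CPU}.
    """
    names = names_by_native_id(dump_text)
    ranked = sorted(cpu_by_tid.items(), key=lambda item: item[1], reverse=True)
    return [
        (names.get(tid, "<unknown tid %d>" % tid), cpu)
        for tid, cpu in ranked[:limit]
    ]
```

Note that dumps in which every thread shows nid=0xffffffff, like the ones posted earlier in this thread, cannot be mapped this way because the native IDs are missing.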