cpu.sh and diskusage.sh scripts causing issue in AEM 6.1 - environments going down frequently

Former Community Member

The OOTB monitoring scripts (cpu.sh and diskusage.sh) under /libs/granite/monitoring are causing our environments to go down frequently in AEM 6.1. I followed the document below to disable the scripts, but I still see errors in my error.log.

Document followed - CQ5.5: "Too many open files" error due to monitoring scripts

Could you please help me resolve this issue?

Below are the logs for the same.

14.03.2016 02:50:13.498 *ERROR* [Process Executor for diskusage.sh] com.adobe.granite.monitoring.impl.ShellScriptExecutorImpl Error while executing script /opt/publish/crx-quickstart/monitoring/diskusage.sh

java.io.IOException: Cannot run program "/opt/publish/crx-quickstart/monitoring/diskusage.sh" (in directory "/opt/publish/crx-quickstart/monitoring"): error=2, No such file or directory

        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)

        at java.lang.Runtime.exec(Runtime.java:620)

        at com.adobe.granite.monitoring.impl.ShellScriptExecutorImpl$ProcessExecutor.run(ShellScriptExecutorImpl.java:307)

        at java.lang.Thread.run(Thread.java:745)

Caused by: java.io.IOException: error=2, No such file or directory

        at java.lang.UNIXProcess.forkAndExec(Native Method)

        at java.lang.UNIXProcess.<init>(UNIXProcess.java:248)

        at java.lang.ProcessImpl.start(ProcessImpl.java:134)

        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)

        ... 3 common frames omitted

14.03.2016 02:50:13.500 *ERROR* [Shell Script Executor Thread for cpu.sh] com.adobe.granite.monitoring.impl.ShellScriptExecutorImpl Unable to extract script 'cpu.sh' to '/opt/publish/crx-quickstart/monitoring/cpu.sh'

java.io.FileNotFoundException: cpu.sh

        at com.adobe.granite.monitoring.impl.ShellScriptExecutorImpl.extractScript(ShellScriptExecutorImpl.java:177)

        at com.adobe.granite.monitoring.impl.ShellScriptExecutorImpl.execute(ShellScriptExecutorImpl.java:112)

        at com.adobe.granite.monitoring.impl.ScriptMBean.invoke(ScriptMBean.java:99)

        at com.adobe.granite.monitoring.impl.ScriptMBean.invoke(ScriptMBean.java:158)

        at com.adobe.granite.monitoring.impl.ScriptConfigImpl$ExecutionThread.run(ScriptConfigImpl.java:209)

        at java.lang.Thread.run(Thread.java:745)

4 Replies

Level 10

Can you post this question to the main AEM forums please:

Adobe Experience Manager

Level 1

AEM 6.1

Assuming that you're monitoring CPU and Disk Usage via some third-party tool like Nagios, you just need to disable the scheduled job that's causing the script to run.

I'm pretty sure the error you posted above is because you followed the directions for a 5.5 installation when working in a 6.1 installation.

Deleting those nodes doesn't stop the job from attempting to run. The job still fires on schedule, but there's no longer a script at the other end for it to execute.
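If you want to confirm that this is what happened on your box, a quick check of the extraction directory can help. The following is only a minimal sketch: the function name `check_monitoring_scripts` is made up for illustration, and the directory is simply the one that appears in the error.log lines from the question, so adjust it to your own install.

```python
import os


def check_monitoring_scripts(monitoring_dir, script_names=("cpu.sh", "diskusage.sh")):
    """Report, for each script name, whether the file exists and is executable.

    Hypothetical helper for illustration; it only inspects the filesystem
    and does not talk to AEM in any way.
    """
    status = {}
    for name in script_names:
        path = os.path.join(monitoring_dir, name)
        exists = os.path.isfile(path)
        status[name] = {
            "exists": exists,
            # Only meaningful when the file is actually there.
            "executable": exists and os.access(path, os.X_OK),
        }
    return status


if __name__ == "__main__":
    # Path taken from the log lines in the question; yours may differ.
    result = check_monitoring_scripts("/opt/publish/crx-quickstart/monitoring")
    for name, info in result.items():
        print(name, info)
```

If a script shows up as missing while the scheduled job is still configured, that would be consistent with the "error=2, No such file or directory" entries above.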

In a 6.1 instance:

1. Log in as the admin user

2. Go to /system/console/configMgr

3. Search for com.adobe.granite.monitoring.impl.ScriptConfigImpl

You should find two configurations under "Adobe Granite Monitor Handler".

They'll look something like:  com.adobe.granite.monitoring.impl.ScriptConfigImpl.cb56e988-9241-4bfc-934f-c908456ce9bc

In our case we had four configurations (two for cpu.sh and two for diskusage.sh; evidently a service pack created the duplicates and we didn't catch it when it happened). Anyway, just delete those configurations and that'll stop the scheduled job(s), without a restart.

When duplicate configurations run the same script, you'll end up with 'java.io.IOException: Cannot run program "/opt/aem/aeminstall/crx-quickstart/monitoring/diskusage.sh" (in directory "/opt/aem/aeminstall/crx-quickstart/monitoring"): error=26, Text file busy' in your error.log over and over.

We deleted two of the four configs and we were able to get that error out of our logs.
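To check whether the errors have really stopped, you can count the ShellScriptExecutorImpl error lines per script in error.log. This is a rough sketch, not anything Adobe ships: `count_script_errors` and the regular expression are my own, and the pattern assumes the log line layout shown in the question (timestamp, `*ERROR*`, a thread name ending in the script name, then the logger class).

```python
import re
from collections import Counter

# Assumed layout, based on the log lines quoted in this thread:
# 14.03.2016 02:50:13.498 *ERROR* [Process Executor for diskusage.sh] com.adobe...ShellScriptExecutorImpl ...
ERROR_LINE = re.compile(
    r"^\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2}\.\d{3} \*ERROR\* "
    r"\[[^\]]*?(?P<script>[\w.-]+\.sh)\] "
    r"com\.adobe\.granite\.monitoring\.impl\.ShellScriptExecutorImpl "
)


def count_script_errors(lines):
    """Count ShellScriptExecutorImpl *ERROR* entries per script name.

    `lines` can be any iterable of strings, e.g. an open error.log file.
    Stack trace lines and unrelated log entries are ignored.
    """
    counts = Counter()
    for line in lines:
        match = ERROR_LINE.match(line)
        if match:
            counts[match.group("script")] += 1
    return counts
```

Pass it an open file handle for your error.log (or just the lines written after you removed the configs). If the counts for cpu.sh and diskusage.sh stop growing, the jobs are no longer firing.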

However, we monitor all of our system resources via Nagios and don't really need these internal application monitors consuming memory or CPU in the application (and risking the "too many open files" problem), so we eventually removed all of the Adobe Granite Monitor Handler configs.

Hope that helps.

Level 1

The root of this problem is that, OOTB, the scheduled jobs in AEM reference an incorrect path to the scripts. It is not a problem with how AEM was installed, but rather a poorly configured default. You can either delete the jobs in the OSGi config console, as explained above, or update them with the correct path to the scripts.

I chose to delete them, based on previous comments in this thread, as we are monitoring disk and CPU with a real monitoring system and do not require AEM to do such things.