Email notification : workflows

Hi All,

I want to write code for the following (out-of-the-box AEM 5.6.1):

- When the number of stale workflows in the system exceeds a given threshold (say, 5),
a notification should be sent to an email inbox or distribution list, stating that the
number of stale workflows has exceeded 5.

I visited a few forums and was told the following:

1] You can use JCR observation to achieve this.
2] Another way is to use scheduling in Sling.

Can anyone provide additional inputs, pointers, or relevant examples? That would be helpful.

1 Accepted Solution

Hi,

I'm curious why you're so interested in stale workflows.  A stale workflow is a workflow in a running state whose stored metadata indicates that a Sling job should be executing it, but for which the Sling job system reports that the job cannot be found.  This can sometimes occur on a system under high load, where a started workflow has been persisted in the repository but the Sling job has not yet been persisted.  In that case the workflow will stop being stale on its own, as long as the system is not shut down.  If a workflow has been stale for quite some time, restarting it may be the best option, as the job will be rescheduled.

You cannot identify stale workflows with a simple JCR query; you must correlate the query results with Sling jobs.  Instead of writing your own code, I would suggest using the workflow maintenance JMX bean described in [0].  It has an operation that reports how many stale workflows there are (optionally per model) and an operation that restarts stale workflows.
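
As a rough illustration of calling that bean from code, here is a minimal Java sketch that uses only the JDK's JMX API. The object name is the one shown in the AEM JMX console for the workflow maintenance bean. The operation name `countStaleWorkflows`, its single String (model) parameter, and its numeric return value are assumptions that should be checked in the console on your instance, as is the idea that code in the same JVM can reach the bean through the platform MBean server. `FakeMaintenance` is a hypothetical stand-in so the example runs outside AEM.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.StandardMBean;

public class StaleWorkflowCheck {

    // Object name as shown in the JMX console for the workflow maintenance bean.
    static final String MBEAN_NAME = "com.adobe.granite.workflow:type=Maintenance";
    // ASSUMPTION: operation name and (String model) signature; verify in the JMX console.
    static final String COUNT_OPERATION = "countStaleWorkflows";

    /** Invokes the count operation; a null model is assumed to mean "all models". */
    static int countStale(MBeanServer server, String model) throws Exception {
        Object result = server.invoke(
                new ObjectName(MBEAN_NAME),
                COUNT_OPERATION,
                new Object[] { model },
                new String[] { String.class.getName() });
        return ((Number) result).intValue();
    }

    /** True only when the count is strictly greater than the threshold. */
    static boolean exceedsThreshold(int staleCount, int threshold) {
        return staleCount > threshold;
    }

    /** Hypothetical stand-in for the real bean, so the sketch runs outside AEM. */
    public interface FakeMaintenance {
        int countStaleWorkflows(String model);
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName(MBEAN_NAME);
        if (!server.isRegistered(name)) {
            // Outside AEM: register a fake bean that pretends 7 workflows are stale.
            FakeMaintenance fake = model -> 7;
            server.registerMBean(new StandardMBean(fake, FakeMaintenance.class), name);
        }
        int stale = countStale(server, null);
        if (exceedsThreshold(stale, 5)) {
            System.out.println("Stale workflows: " + stale + ", notify administrators");
        }
    }
}
```

The threshold check is kept separate from the JMX call so the limit ("x") can be changed or tested without a running instance.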

If you want to simulate stale workflows, the best way is to pause the workflow Sling job queue, run some workflows, then clear the queue.  This should delete the jobs that the workflows expect to be in the system.

As for making this call periodically, you can use the Sling scheduler; see [1].  It is relatively straightforward and quite powerful, and it offers a few options for building and configuring your scheduled job.
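
A minimal sketch of the periodic check, written as a plain `Runnable`. According to the Sling scheduler documentation [1], a `Runnable` registered as an OSGi service with a `scheduler.period` or `scheduler.expression` property is picked up by the scheduler. The OSGi registration is left out here so the example stays self-contained, and a JDK `ScheduledExecutorService` drives the job in `main` purely as a stand-in. The `IntSupplier` (source of the stale count) and the `Consumer<String>` (notifier) are hypothetical hooks, not AEM APIs; in a real bundle they would wrap the JMX call and whatever mail service you use.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.IntSupplier;

public class StaleWorkflowAlertJob implements Runnable {

    private final IntSupplier staleCount;     // hypothetical hook: returns the current stale count
    private final int threshold;              // the "x" from the requirement
    private final Consumer<String> notifier;  // hypothetical hook: sends the message somewhere

    public StaleWorkflowAlertJob(IntSupplier staleCount, int threshold, Consumer<String> notifier) {
        this.staleCount = staleCount;
        this.threshold = threshold;
        this.notifier = notifier;
    }

    /** One scheduled run: read the count and notify only if it exceeds the threshold. */
    @Override
    public void run() {
        int count = staleCount.getAsInt();
        if (count > threshold) {
            notifier.accept("Number of stale workflows is " + count
                    + ", which exceeds the limit of " + threshold);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Stubs: pretend 7 workflows are stale and "send" by printing to stdout.
        Runnable job = new StaleWorkflowAlertJob(() -> 7, 5, System.out::println);

        // Stand-in for the Sling scheduler: run the job every second for a short demo.
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        executor.scheduleAtFixedRate(job, 0, 1, TimeUnit.SECONDS);
        Thread.sleep(2500);
        executor.shutdownNow();
    }
}
```

Because the job only depends on the two hooks, the same class could be driven by a fixed period or a cron-style expression without changes to the check itself.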

Hope this helps,

Will

[0] http://helpx.adobe.com/experience-manager/kb/workflow-monitor-via-jmx.html

[1] http://sling.apache.org/documentation/bundles/scheduler-service-commons-scheduler.html

Hi All,

I also have a few related questions:

1] How do we simulate stale workflows in an OOTB AEM 5.6.1 instance?

2] In CRXDE Lite there is a Tools -> Query option. We can probably use it to retrieve the names of all the stale workflows in the system.
I am trying this out. Am I correct?

I am not sure whether we can use this to meet the requirement we are targeting (an email notification sent when the number of stale workflows in the system exceeds "x").

Hi Will,

Thanks a lot for your reply.

1] Reason I am after stale workflows

   We often see CQ performing slowly, and the cause is stale workflows. Coming from a support background and being new to CQ,
   I wanted to bring in an improvement here: if we are notified when the count exceeds, say, 'x',
   we can proactively take some steps on our end. I completely agree that we should find out why
   workflows go stale rather than rely on this, but for now this implementation would be very helpful.

2] "Instead of writing your own code I would suggest using the workflow maintenance jmx bean"

   I did go through this (http://localhost:4502/system/console/jmx/com.adobe.granite.workflow:type=Maintenance), but I do not see
   how I can use it for my objective (i.e., notifying administrators when the count rises above 'x').

3] "If you want to simulate stale workflows the best way to do it"

   I am trying to understand a few things here: the workflow Sling job queue, and deleting the jobs that the workflow is expecting.

4] "As for periodically making this call, you can use the sling scheduler for this"

   I am trying to understand this, as it is not yet clear to me.

5] From the first paragraph, what I understood is that Sling jobs run workflows internally. I am trying to get a better understanding of this.

Some analysis done so far:

To differentiate between aborted, running, stale, and other categories of workflows:

1] In Content Explorer [http://localhost:4502/crx/explorer/browser/index.jsp],

under /etc/workflow/instances/<respective date>/<respective model number>,

I found a property named "status" of type "String" whose value is "ABORTED".

So, could this property be used to write code that differentiates stale workflows from the others?

Is this what "JCR observation" refers to?

2] Another way is to use scheduling in Sling

Can you please provide some additional inputs on this? I do not have much idea about it and am still exploring.

Hi All,

To get more clarity on Sling jobs and workflows, I went through the "CONCURRENT WORKFLOW PROCESSING" and "CONFIGURE THE QUEUE FOR A SPECIFIC WORKFLOW MODEL" sections in https://dev.day.com/docs/en/cq/current/deploying/performance.html

Hi All,

Any thoughts, pointers, or examples on doubt #2 (mentioned in the posts above) in particular, and on the others, would be helpful.

Hi Sham,

Thanks a lot for your reply.

However, I do not understand how I can use the available MBeans to meet my requirement (i.e., to send a notification to system administrators when the number of stale workflows exceeds 'x'). I am not sure how to go about implementing this. Any detailed inputs, thoughts, or pointers would be really helpful.