Hello all, our team has run into a major problem.
Our main instance is not working. We've tried restarting it, and it works for a few minutes but then stalls.
We did some troubleshooting and found that there are 22,000 running workflows on our offloading servers (we have two, so 11,000 each).
I was thinking of using curl to delete the /etc/workflow/instances/server0 node from the back end, in the hope that it will solve our problem.
Is this a viable method?
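To make the question concrete, the call I have in mind would look roughly like the sketch below. It assumes the Sling POST servlet's `:operation=delete` is available on the instance. The URL and credentials are placeholders, not our real values, and `delete_node` and `DRY_RUN` are names made up for this example. By default it only prints the command rather than running it, since I have not tried this against the real instance.

```shell
#!/bin/sh
# Placeholder values (assumptions for illustration, not our real setup).
AEM_URL="${AEM_URL:-http://localhost:4502}"
AEM_CREDS="${AEM_CREDS:-admin:admin}"
NODE_PATH="/etc/workflow/instances/server0"

# Hypothetical helper: deletes a repository node through the Sling POST
# servlet (:operation=delete). With DRY_RUN=1 (the default) it only prints
# the curl command so it can be reviewed before anything is removed.
delete_node() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "curl -u $AEM_CREDS -F :operation=delete $AEM_URL$1"
  else
    curl -u "$AEM_CREDS" -F ":operation=delete" "$AEM_URL$1"
  fi
}

delete_node "$NODE_PATH"
```

Running it with `DRY_RUN=0` would send the actual delete request, which removes the whole server0 subtree in one go, so I would want to back up the node first.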
Also, this is what the error.log keeps spitting out:
13.07.2016 11:46:42.636 *ERROR* [sling-default-3042-org.apache.sling.event.jobs.Job:ad251b44-a812-4e15-8a45-c51e155bd723-0] org.apache.sling.commons.scheduler.impl.QuartzScheduler Exception during job execution of org.apache.sling.event.impl.jobs.scheduling.JobSchedulerImpl@6a09566e : null
java.lang.NullPointerException: null
at org.apache.sling.event.impl.jobs.JobManagerImpl.addJobInteral(JobManagerImpl.java:537)
at org.apache.sling.event.impl.jobs.JobManagerImpl.addJob(JobManagerImpl.java:704)
at org.apache.sling.event.impl.jobs.JobManagerImpl.addJob(JobManagerImpl.java:267)
at org.apache.sling.event.impl.jobs.scheduling.JobSchedulerImpl.execute(JobSchedulerImpl.java:280)
at org.apache.sling.commons.scheduler.impl.QuartzJobExecutor.execute(QuartzJobExecutor.java:116)
at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)