I am planning to delete a large number of workflow instances. When I tried this earlier, it took more than 17 hours to delete 14506560 nodes. Is there a way to speed this up? I was told that this is because of incremental reindexing. Can we stop incremental reindexing?
I tried this, along with the options to count the active and completed workflows. It worked great for those, but I am getting a GATEWAY TIMEOUT error in the middle of the purge operation. I have about a million completed workflows.
Not sure if you have tried this, but first finish all the running workflows, don't trigger any new ones for some time, and block all external access to your server.
Now go to http://<ip>:<port>/crx/explorer and locate the node /etc/workflow/instances/server0. Make a note of all the properties on your server<num-val> sling folder node, then delete it using the crx/explorer view. You can either delete the date folders one at a time or delete the parent node, which is the server<num-val> sling folder.
Given the number of nodes you have, deleting all the workflow instances will take a long time, so make sure you do this through the IP and port URL and not through the Apache URL, if you have one. Apache is probably what is giving you the gateway timeout error, since it expects a response within its own timeout.
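If clicking through crx/explorer is not practical, the "one date folder at a time" variant could be scripted roughly like this. It is only a sketch under assumptions: the host, port, and the server0 path are placeholders, the date folders are assumed to sit directly under that node, and it relies on the standard Sling JSON listing (`.1.json`) and the Sling POST servlet's `:operation=delete`. Each POST should be saved on its own, so "Save All" does not apply to folders removed this way. The function names are made up for illustration.

```python
import base64
import json
import urllib.parse
import urllib.request

# Assumptions for illustration: adjust host, port and the server<num-val> path.
BASE_URL = "http://localhost:4502"  # direct instance URL, not the Apache one
INSTANCES_PATH = "/etc/workflow/instances/server0"


def child_folder_names(listing):
    """Return the child node names found in a Sling '.1.json' listing.

    In that listing, child nodes show up as nested JSON objects, while
    properties are plain values or arrays, so only dict values are kept.
    """
    return sorted(name for name, value in listing.items() if isinstance(value, dict))


def _open(url, user, password, data=None, timeout=3600):
    """Open a URL with basic auth and a long timeout, since deletes are slow."""
    request = urllib.request.Request(url, data=data)
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", "Basic " + token)
    return urllib.request.urlopen(request, timeout=timeout)


def purge_date_folders(user, password):
    """Delete each child folder of INSTANCES_PATH with its own request."""
    with _open(BASE_URL + INSTANCES_PATH + ".1.json", user, password) as response:
        listing = json.load(response)
    for name in child_folder_names(listing):
        # Sling POST servlet: ':operation=delete' removes the addressed node.
        body = urllib.parse.urlencode({":operation": "delete"}).encode()
        url = BASE_URL + urllib.parse.quote(f"{INSTANCES_PATH}/{name}")
        with _open(url, user, password, data=body) as response:
            print(name, response.status)
```

Calling `purge_date_folders("admin", "<password>")` against the direct port keeps Apache out of the path. A single date folder can still hold a lot of nodes, so an individual request may run for a long time.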
Once the deletion succeeds, crx/explorer will enable the "Save All" button. Click "Save All".
Now recreate the server<num-val> folder with exactly the same properties it had before. Ideally, the only property that needs to be right is slingId.
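The folder can be recreated by hand in crx/explorer, but as a rough sketch it could also be done with one Sling POST. The assumptions here: the node type is `sling:Folder` (as noted from the original node), `slingId` is the only property you need to restore, and the host, port, and path are placeholders. Any other properties you noted down would have to be added to the form fields. The function names are made up for illustration.

```python
import base64
import urllib.parse
import urllib.request

# Assumptions for illustration: adjust host, port and the server<num-val> path.
BASE_URL = "http://localhost:4502"  # direct instance URL, not the Apache one
FOLDER_PATH = "/etc/workflow/instances/server0"


def folder_form(sling_id, primary_type="sling:Folder"):
    """Build the form body the Sling POST servlet uses to create the node."""
    fields = {"jcr:primaryType": primary_type, "slingId": sling_id}
    return urllib.parse.urlencode(fields).encode()


def recreate_folder(user, password, sling_id):
    """POST to the (now missing) path so that Sling creates the folder."""
    request = urllib.request.Request(BASE_URL + FOLDER_PATH, data=folder_form(sling_id))
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.status
```

Pass the slingId value you noted before deleting, for example `recreate_folder("admin", "<password>", "<old slingId>")`, and check the result in crx/explorer before triggering a workflow.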
Finally, trigger any workflow to make sure that new instances are created correctly under the newly created folder.