Do you mean that the corruption is caused by garbage collection and shows up when you check the last known good configuration, or that you fix the corruption by reverting to the last known good configuration?
In any case, the root cause should be logged in the error.log file(s).
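If the logs are large, a small script can help narrow down where the first signs of trouble appear. The following is only a rough sketch: the search strings in `INDICATORS` and the function names are my own assumptions for illustration, not an official list of Oak error messages, so adjust them to whatever you actually see in your logs.

```python
from collections import Counter

# Illustrative search strings only (an assumption, not an exhaustive list).
INDICATORS = ("SegmentNotFoundException", "OutOfMemoryError")


def scan_log(lines, indicators=INDICATORS):
    """Count indicator hits and remember the first matching line for each."""
    counts = Counter()
    first_hits = {}
    for number, line in enumerate(lines, start=1):
        for indicator in indicators:
            if indicator in line:
                counts[indicator] += 1
                # Keep only the earliest occurrence as (line number, text).
                first_hits.setdefault(indicator, (number, line.rstrip()))
    return counts, first_hits


def scan_file(path, indicators=INDICATORS):
    """Run scan_log over a log file, tolerating odd characters in the log."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return scan_log(handle, indicators)
```

You would call it as `counts, first_hits = scan_file("error.log")`, pointing it at the log location of your own instance. The earliest hit is usually the most useful one to compare against the time your maintenance tasks ran.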
A couple of questions:
Does the corruption happen in the segmentstore, the indexes, or both?
Do you compact/compress the repo periodically, and if so, offline or online?
What commands/scripts do you use to check the last known good configuration and to run the other tasks you mentioned?
Can you make sure that the version of the oak jar you use for maintenance tasks matches the CRX Oak version on both the author and publish instances?