It would work the same way we currently train any personalization engine with a "maximize revenue" setup, except that negative events would be allowed to reduce the revenue (even to below zero).
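To make the idea concrete, here is a minimal sketch of what a signed "revenue" reward could look like. It is not Target's API: the event names, the dollar values, and the helper functions are all hypothetical, chosen only to show negative events pulling an experience's score below zero.

```python
from collections import defaultdict

# Hypothetical event values. Negative numbers are penalties assigned by us,
# not real revenue figures.
EVENT_VALUES = {
    "purchase": 50.0,
    "subscription_cancel": -120.0,
    "return": -30.0,
    "support_call": -8.0,
}

def net_reward(events):
    """Sum the signed values of one visitor's events; unknown events count as 0."""
    return sum(EVENT_VALUES.get(name, 0.0) for name in events)

def mean_reward_by_experience(sessions):
    """sessions: iterable of (experience_id, [event names]) pairs."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for experience, events in sessions:
        totals[experience] += net_reward(events)
        counts[experience] += 1
    return {exp: totals[exp] / counts[exp] for exp in totals}

# Made-up sessions for two hypothetical cancel-flow experiences.
sessions = [
    ("offer_discount", []),                       # did not cancel: reward 0
    ("offer_discount", ["subscription_cancel"]),  # canceled: reward -120
    ("offer_pause", []),
    ("offer_pause", []),
]
scores = mean_reward_by_experience(sessions)
best = max(scores, key=scores.get)  # still a plain "maximize revenue" choice
```

The point of the sketch is that a visitor who does nothing simply contributes zero, so the non-action never has to be measured. Only the cancel event is recorded, and maximizing the signed total steers traffic away from the experiences that produce it.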
Cancel flows have only one reliably measurable metric for training, and it's the thing we don't want visitors to do. I can't train toward "not canceling" because a large portion of those outcomes are non-actions (i.e. "did not cancel"), so there is no event to record. There are some buttons that serve as alternatives to canceling, but what if visitors come back and cancel after clicking them? The same goes for visitors who navigate away to a page unrelated to canceling.
Training against negative events (like canceling a subscription) is a powerful application that many companies (like BlackBack) are starting to address. I'd prefer not to buy a bunch of third-party tools for something that Target Personalization would be perfect for with this one capability added. Additionally, this goes beyond canceling: there are all kinds of negative or semi-negative events (like returns/exchanges, warranty claims, and calls to customer service) that we frequently optimize our site to prevent.
Note: A similar (albeit less dynamic) solution would be to allow optimizing for "minimum revenue".
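As a rough illustration of that alternative, assume each cancel is logged as a positive "revenue" value and the engine picks the experience with the lowest average. The experience names and numbers below are invented for the example, and nothing here reflects an existing Target setting.

```python
# Hypothetical average "revenue" per visitor, where the only revenue event
# logged is a cancel recorded with a positive cost.
cancel_cost_per_visitor = {
    "offer_discount": 60.0,
    "offer_pause": 15.0,
}

# "Minimum revenue": choose the experience with the lowest average cost.
best_by_min = min(cancel_cost_per_visitor, key=cancel_cost_per_visitor.get)

# Equivalent choice expressed as maximizing the negated value.
best_by_max = max(cancel_cost_per_visitor, key=lambda e: -cancel_cost_per_visitor[e])
```

Both selections agree, which is why the two requests are similar. The minimizing version is less flexible because every logged event has to count as a cost, so positive and negative events cannot be mixed in one goal.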