Can I avoid race conditions when using static lists as a task queue?
Level 4
February 26, 2019
Solved

Sanford Whiteman and others gave me some great advice on how I could use lists as a work queue to work around Munchkin v2 limitations on what actions could be done when a lead was associated (see my earlier post, "When a lead is merged / associated how can I make previously anonymous activity send an email?").

I'm looking for ways to avoid a race condition where people in a list may get processed more than once.

My goal is to put people into a list for specific campaigns, and use the list as a queue that will be flushed by "cron job" / scheduled campaigns that run once an hour, and also by triggered campaigns that hopefully happen more quickly. The triggers may not always happen, so the cron is the fallback to flush the queues regularly.

Imagine I have a Smart Campaign that finds people in a specific list, then in the flow, removes them from the list, and sends an email.

Could this SC be invoked twice at the exact same time e.g. once from a scheduled campaign, and another time from a triggered campaign? If that was the case, maybe the user gets 2 emails.

What if the user is removed from the list in between when the SC criteria makes them eligible and when the flow actually runs? Should I be re-checking list membership before sending the email? Are there any other gotchas here? I can't say only send the email once, b/c sometimes this is for transactional reasons so people may be queued up in the same list multiple times (but no more than 1 time at once).

Thanks!

SanfordWhiteman (Accepted solution)
Level 10
February 26, 2019

Unfortunately, you're still going to have race conditions in this scenario. There simply isn't enough atomicity or transaction awareness (not in the transactional email sense but in the SQL transaction sense).

What you need is an Atomic Compare-and-Swap (CAS) or true stack to make this work. You could definitely accomplish this using a webhook and a key-value store with CAS abilities.
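To illustrate the compare-and-swap idea, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: `KeyValueStore` is an in-memory stand-in for whatever external key-value store a webhook would call, and `try_claim`, the `"queued"` / `"sending"` states, and the lead ID are hypothetical names, not Marketo features.

```python
import threading


class KeyValueStore:
    """In-memory stand-in (assumption) for a key-value store with CAS."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def put(self, key, value):
        with self._lock:
            self._data[key] = value

    def compare_and_swap(self, key, expected, new):
        """Set key to `new` only if it currently equals `expected`.

        The comparison and the write happen under one lock, so two
        callers can never both succeed for the same transition.
        """
        with self._lock:
            if self._data.get(key) != expected:
                return False
            self._data[key] = new
            return True


def try_claim(store, lead_id):
    """Hypothetical webhook handler: claim a queued lead exactly once."""
    return store.compare_and_swap(lead_id, "queued", "sending")


store = KeyValueStore()
store.put("lead-123", "queued")

results = []
results_lock = threading.Lock()


def campaign_run():
    claimed = try_claim(store, "lead-123")
    with results_lock:
        results.append(claimed)


# Simulate the scheduled and the triggered campaign firing together.
threads = [threading.Thread(target=campaign_run) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))  # exactly one claim succeeds
```

Only the run whose claim returns `True` would go on to send the email; the other run would stop without sending.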

(Also, just FYI, I'm primarily @Sanford Whiteman -- you @'d one of my secondary accounts!)

Denise_Greenb12
Level 6
February 26, 2019

Hi Jon,

Re: "Could this SC be invoked twice at the exact same time e.g. once from a scheduled campaign, and another time from a triggered campaign? If that was the case, maybe the user gets 2 emails."

Sanford is right. However, it seems like you could minimize this possibility by adding a choice to the flow of each campaign: "Member of Smart Campaign is Not <The Other Campaign>".

Re: "What if the user is removed from the list in between when the SC criteria makes them eligible and when the flow actually runs? Should I be re-checking list membership before sending the email?" Are you worried the user who has been removed from the list will get the email when she shouldn't because she is no longer eligible by the time the flow runs? If so, re-checking list membership before sending the email is a reasonable approach.

Denise

SanfordWhiteman
Level 10
February 26, 2019

Re: "If so, re-checking list membership before sending the email is a reasonable approach."

But the lookup and the subsequent send are not interlocked, and there isn't a unified view of the database that's guaranteed to persist across the steps of a flow. (Think about how many flows would be broken if this kind of isolation were in place!)

This race condition is more like the classical examples from programming. It's a fine-grained example of how checking for a condition, then proceeding as if the condition is still true despite the surrounding system not making that guarantee, is ultimately unreliable -- however unlikely it seems that you'll run into the bad case.
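The check-then-act problem described above can be sketched in a few lines of Python. This is a deliberately simplified model, not how Marketo works internally: `static_list`, `is_queued`, and `remove_and_send` are hypothetical names, and the interleaving is written out by hand so the bad ordering is reproducible.

```python
# Hypothetical stand-ins: a static list used as a queue, and a log of sends.
static_list = {"lead-123"}
emails_sent = []


def is_queued(lead_id):
    """The 'check': is the lead currently in the list?"""
    return lead_id in static_list


def remove_and_send(lead_id, campaign):
    """The 'act': remove from the list, then send the email."""
    static_list.discard(lead_id)
    emails_sent.append((campaign, lead_id))


# Both campaigns evaluate the condition before either one acts on it.
scheduled_eligible = is_queued("lead-123")
triggered_eligible = is_queued("lead-123")

# Each then proceeds as if its earlier check were still true.
if scheduled_eligible:
    remove_and_send("lead-123", "scheduled")
if triggered_eligible:
    remove_and_send("lead-123", "triggered")

print(len(emails_sent))  # 2: the same lead was emailed twice
```

Re-checking membership just before the send only narrows the window between check and act; it does not close it.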

One way to avoid this is to use a system that can deliberately invalidate the condition at exactly the same time (interlocked) as it evaluates the condition. That guarantees that any later attempt to read the same condition will fail, even if it's only one clock tick later. Or you can use a system with at-most-once semantics that pops something off a stack and guarantees never to pop it again.
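As a rough sketch of the "pop it at most once" idea, Python's standard-library `queue.LifoQueue` is a thread-safe stack: removing an item and learning that you got it are a single interlocked step. The campaign names and lead ID below are hypothetical, and this models the concept outside Marketo rather than anything Marketo provides.

```python
import queue
import threading

# A thread-safe stack holding one queued lead (hypothetical ID).
work_stack = queue.LifoQueue()
work_stack.put("lead-123")

sends = []
sends_lock = threading.Lock()


def campaign(name):
    """Pop a lead if one is available; the pop itself is the claim."""
    try:
        lead_id = work_stack.get_nowait()
    except queue.Empty:
        return  # someone else already popped it; nothing to send
    with sends_lock:
        sends.append((name, lead_id))


threads = [
    threading.Thread(target=campaign, args=(name,))
    for name in ("scheduled", "triggered")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(sends))  # 1: only one campaign obtained the lead
```

Whichever campaign wins the pop is the only one that can send, because there is no separate check that could go stale.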

Jon_Wu (Author)
Level 4
February 26, 2019

Seems like without the ability to lock, you basically can't ever implement at-most-once. Outside of Marketo we use Pub/Sub, which has at-least-once delivery as is common with distributed systems, so we have to track each message ID centrally in MySQL with locking to avoid duplicate processing.
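For readers unfamiliar with that pattern, here is a minimal sketch of deduplicating at-least-once deliveries by message ID. It is not the setup described above: it uses SQLite from the Python standard library as a stand-in for MySQL, and it relies on a primary-key constraint rather than explicit locking. The table name, `process_once`, and the message IDs are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed (message_id TEXT PRIMARY KEY)")


def process_once(conn, message_id, handler):
    """Run handler only if this message ID has not been recorded before.

    The INSERT is the claim: the primary key lets exactly one insert
    succeed per ID, so a redelivered message is detected and skipped.
    """
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO processed (message_id) VALUES (?)",
                (message_id,),
            )
    except sqlite3.IntegrityError:
        return False  # duplicate delivery
    handler(message_id)
    return True


handled = []
deliveries = ["msg-1", "msg-2", "msg-1"]  # msg-1 is delivered twice
outcomes = [process_once(conn, m, handled.append) for m in deliveries]

print(outcomes)  # [True, True, False]
print(handled)   # ['msg-1', 'msg-2']
```

Note that this sketch records the ID before running the handler, so a handler failure would leave the message marked as processed; that is the at-most-once trade-off.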

It seems like something similar to the list / flow I have in the screenshots in my other post is as close as I'm going to get in Marketo. Thanks for verifying, just wanted to make sure I wasn't missing some expert strategy. It would be kind of nice if Marketo couldn't run a specific SC for the same user more than once in parallel, but that's probably way too complicated in a distributed system.