How can I prevent spam leads from entering Marketo?

Question

Hi folks,

We're starting to experience some spam on our blog forms and are looking for a solve. I've seen articles about ReCaptcha and honeypots, but am not sure that either alone will solve our root issue. I'm hoping there's a combined approach that could. We are proactively trying to address our global forms before any potential escalation in spam attack volume.

My understanding (please correct me if I'm wrong) is that the ReCaptcha implementation found here does not prevent leads from entering Marketo. Instead, the data from the ReCaptcha is webhooked into Marketo and appended to the Lead record. You can then use the data to delete spam leads through a flow.

My understanding is also that honeypot fields are easy for a dedicated spammer to identify (even if they don't have an obvious name) and bypass. That said, this article implies that a honeypot can be used to prevent form submits from even happening - a desired result.

Goal:

Prevent Spam lead data from entering Marketo. This could look like spam leads not being able to submit Marketo forms OR preventing the data from form submits from reaching Marketo.

This is to make sure that:

Marketo's API is not impacted by sudden high inbound volume
Campaigns, etc. do not trigger and impact the API - with the current system setup, they would have to be updated one by one to filter out leads flagged as spam by ReCaptcha data
Triggers, etc. are not delayed by a backlog
There is no need for ongoing system cleansing of spam leads, especially at high volume

Is this a viable solution?

Implement a hidden simple boolean true/false ReCaptcha field on the Marketo form
Include JavaScript similar to the honeypot article linked above, but for the ReCaptcha
If an automated spam script fills out the form, including the hidden ReCaptcha field, this will trigger the JavaScript to prevent the form from being able to submit OR filter out the data from ever reaching Marketo
Standard non-Spam leads will not need to fill out the ReCaptcha (e.g. if ReCaptcha is TRUE, the lead is Spam) and will pass through to Marketo

If this is not possible, is there some way to use a proxy in tandem with Marketo forms to prevent syncing bad data to the system? Other solutions?

Thanks so much for any help and ideas!

Cheers,
Julia

P.S. @Sanford Whiteman tagging you since I know you've been an invaluable resource on past ReCaptcha questions.

SanfordWhiteman · Accepted Answer

First, honeypot fields are ridiculous. Worse than useless. Anyone who continues to champion them just doesn't understand how forms work, or how the web (including malicious and legit actors) works in general.

Second, realize that reCAPTCHA never (in any system, not just Marketo) stops non-humans from submitting form data. It cannot ever do this, because malicious actors do not use JavaScript. reCAPTCHA relies on an end user fingerprint, generated using JS and verified on your server via webhook, to determine whether the submission was from a human or machine (or, in the latest v3, whether they tilt toward human or machine, instead of a binary distinction).

So reCAPTCHA can be used to intuit whether form data was submitted by human or machine, but it doesn't stop the data from being submitted.
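
To make the server-side step concrete, here is a minimal Python sketch of the verification. It assumes Google's siteverify endpoint and the `success` and `score` fields of its JSON reply; the 0.5 cutoff, the function names, and the way the secret is passed in are illustrative assumptions, not part of any Marketo integration.

```python
import json
import urllib.parse
import urllib.request

SITEVERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def classify(verdict: dict, min_score: float = 0.5) -> str:
    """Turn a siteverify reply into 'human', 'machine', or 'invalid'.

    min_score is an assumed cutoff; tune it against your own traffic.
    """
    if not verdict.get("success"):
        return "invalid"  # token missing, expired, reused, or forged
    score = verdict.get("score")
    if score is None:
        return "human"  # v2 replies carry no score; success is the verdict
    return "human" if score >= min_score else "machine"


def verify_token(secret: str, token: str) -> dict:
    """POST the browser-generated token to Google and return the parsed reply."""
    body = urllib.parse.urlencode({"secret": secret, "response": token}).encode()
    with urllib.request.urlopen(SITEVERIFY_URL, data=body, timeout=10) as resp:
        return json.load(resp)
```

A submission posted by a script that never ran the JS has no valid token, so `classify` returns `"invalid"` for it. The form data has still been posted by the time this check runs; the result only labels the submission.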

Now, in Marketo, you have the less-than-optimal reality that once form data is submitted, a lead is upserted before any other inspection of the payload can be done. In other systems, the form data can be inspected, again after submission, but before it enters the next stage of processing. In Marketo you can only inspect and attempt to subvert/revert the actions done before you're given a chance to check the reCAPTCHA fingerprint.

When plugging reCAPTCHA into Marketo, you need to tune your workflow so that form intake processes work in serial, not in parallel, so you always have control over the next step. You need to make sure that not just reCAPTCHA, but other prerequisites like an SMTP verify webhook, have a positive outcome before letting people move to the next step (i.e. making robust use of Request Campaign and the Webhook is Called trigger). I just rolled out a robust reCAPTCHA implementation for a client that was a huge net win, because it taught them a lot about rogue processes they didn't even realize were running, in random order, on every form fill! The end result was a workflow that's (mostly) self-documenting and stops non-human leads from entering the system.
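
The serial gating idea can be modeled outside Marketo. The sketch below is plain Python, not Marketo configuration: each check stands in for one webhook-backed step, and a lead only advances when the previous step had a positive outcome. The check names and lead fields are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Lead = Dict[str, object]
Check = Tuple[str, Callable[[Lead], bool]]


def first_failed_step(lead: Lead, checks: List[Check]) -> Optional[str]:
    """Run checks strictly in order; return the first failing step's name, or None."""
    for name, passed in checks:
        if not passed(lead):
            return name  # stop here: no later step ever sees this lead
    return None


# Hypothetical stand-ins for webhook results written back to the lead record.
CHECKS: List[Check] = [
    ("recaptcha", lambda lead: lead.get("recaptcha_verdict") == "human"),
    ("smtp_verify", lambda lead: lead.get("smtp_deliverable") is True),
]
```

For example, `first_failed_step({"recaptcha_verdict": "machine", "smtp_deliverable": True}, CHECKS)` returns `"recaptcha"`, and the SMTP check is never evaluated. Running the steps in a fixed order is what keeps the outcome predictable, in contrast to campaigns that all fire in parallel on the form fill.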

Is this a viable solution?

Implement a hidden simple boolean true/false ReCaptcha field on the Marketo form
Include JavaScript similar to the honeypot article linked above, but for the ReCaptcha
If an automated spam script fills out the form, including the hidden ReCaptcha field, this will trigger the JavaScript to prevent the form from being able to submit OR filter out the data from ever reaching Marketo
Standard non-Spam leads will not need to fill out the ReCaptcha (e.g. if ReCaptcha is TRUE, the lead is Spam) and will pass through to Marketo

No, this does not make sense. There's no such thing as a reCAPTCHA that operates entirely on the client side, and the last thing you want is a reCAPTCHA that acts more like a honeypot (i.e. is more fake)!

Pratyusha_Ram1 · Answer

We faced this issue multiple times over the last couple of months. Here's what we observed - initially, we saw a surge of 10-15k leads per day and these ended up as fake handraisers. This is what alerted us as we saw a sudden spike in the daily MQL reports.

Naturally, we looped in our Digital Marketing team, as the leads seemed to be sourced from a particular form. They jumped right in with several remedies:

Expansion of the existing honeypot solution
reCaptcha
IP and email domain blocking on the web forms

But nothing seemed to stop the incoming spam leads. That is when we compared the Google Analytics stats and noticed that the incoming web page hits didn't match the # of incoming spam leads. This raised the suspicion that the spam records could be hitting the MKTO endpoint directly.
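
That comparison can be expressed as a simple check. This is a sketch, assuming you can export daily form-page views from your analytics tool and daily form fills from Marketo; the function name and the sample numbers are made up for illustration.

```python
def fills_without_page_view(page_views: int, form_fills: int) -> int:
    """Lower bound on submissions that cannot have come from a rendered page."""
    return max(0, form_fills - page_views)
```

With 1,200 page views and 12,000 form fills in a day, `fills_without_page_view(1200, 12000)` returns `10800`. A non-zero result means at least that many submissions arrived without the page being loaded, which is consistent with posts going straight to the form endpoint.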

We now have

a daily report of the incoming leads matching the spam leads criteria
a campaign to mark these incoming spam leads as invalid, so they are not processed and progressed to the next stages
an ongoing effort to delete these leads - not just from MKTO, but the integrated systems as well

But this is not viable. So we got on a call with the MKTO team to discuss the issue and check whether IP blocking or anything similar was possible. Apparently not. They told us that a long-term solution is being implemented and will be rolled out in Q1 2020 (tentative :-(, this was 2 months ago!). They recommended that we delete the affected form and create a new one, but this is not going to make the system any less vulnerable. It takes a few seconds to try different form IDs, as two are already exposed.

We are actively in discussion with MKTO CSM and Products team. Let me know if you'd like to discuss the details and I'm happy to jump on a call!
