Note
In part one of this series, I explained why you would want to prevent collecting data from bot traffic and described the basic process for doing so. This article provides a solution that helps you identify spam-like activity on your site that isn’t explicitly recognized as coming from a bot but is nevertheless worth blocking because it is useless.
Introduction
Many clients that Adobe Consulting has worked with over the years have noticed web traffic patterns that could not be identified as explicit bot activity yet appeared robotic "enough" to warrant a second look. In many of the cases, the patterns showed normal looking "users" that would continually (and very quickly) hit a page over and over and over again to the point where they would generate thousands(!) of server calls without any recognizable reason for doing so.
In response, my Adobe colleague, Jeff Branson, produced an ingenious solution to this problem and called it the Hit Governor. This blog post covers the latest version of the Hit Governor, which we recommend adding to your Adobe Data Collection toolkit.
Disclaimers:
- This post demonstrates an AEP Web SDK-based solution as deployed through Adobe Data Collection Tags. Please refer to the official AEP Web SDK documentation for ideas on how to adopt this solution without using Tags.
- This specific solution is not available for Mobile Apps (i.e., the AEP Mobile SDK) as it uses Web/JavaScript-based technology only.
What is the Hit Governor?
The Hit Governor is a JavaScript function – colloquially referred to as a plugin – that determines whether a device has sent at least "X" Adobe server calls within the last "Y" seconds (e.g., sixty hits within the last sixty seconds). If the device meets such a threshold, the Hit Governor's code will flag that particular device as a "bot". With the flag in place, the code will then prevent that "now-identified-as-a-bot" device from sending out any more Adobe server calls for the next "Z" days.
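To make the "X hits within Y seconds" idea concrete, here is a deliberately simplified sketch of that kind of threshold check. It is not the Hit Governor's actual code: it keeps exact timestamps in memory rather than in web storage, and the names createHitCounter, maxHits, and windowSeconds are hypothetical.

```javascript
// Simplified illustration only: counts hits inside a rolling time window.
// createHitCounter, maxHits, and windowSeconds are hypothetical names.
function createHitCounter(maxHits, windowSeconds) {
  var timestamps = []; // times (in milliseconds) of recent hits

  // Record one hit at time nowMs; return true once the threshold is met.
  return function recordHit(nowMs) {
    var cutoff = nowMs - windowSeconds * 1000;
    // Drop hits that fall outside the last "Y" seconds.
    timestamps = timestamps.filter(function (t) {
      return t > cutoff;
    });
    timestamps.push(nowMs);
    return timestamps.length >= maxHits; // at least "X" hits in the window
  };
}
```

In a browser you might create the counter once with createHitCounter(60, 60) and pass Date.now() to it on every server call; the real plugin additionally persists its state and applies the "Z"-day block.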
The use of localStorage
The original version of the Hit Governor code uses cookies to flag bad devices, but Intelligent Tracking Prevention (ITP) and other privacy restrictions have caused cookies to fall out of favor. The version of the Hit Governor in this article uses localStorage instead to flag each device. Unlike cookies, localStorage is not included with each HTTP request and thus is a safer, more privacy-conscious option in cases where only the local client, and not the server, needs to access a device's web storage.
A drawback to using localStorage is that each instance of it is exclusive to each (sub)domain. Practically speaking, this means that data collection on only the (sub)domain where the suspicious activity took place will be blocked. Any other (sub)domains that belong to your Adobe Data Collection instance could still capture server calls from suspicious users unless, of course, the users engage in suspicious activity on the other (sub)domains as well.
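Reading the flag back only takes a couple of localStorage calls. The sketch below mirrors the clean-up that the Step 1 snippet performs when it loads, using the same hg_flag and hg_fexp keys. The helper name isDeviceBlocked and the injected storage and nowMs parameters are assumptions added here so the logic can be exercised outside a browser.

```javascript
// Sketch: returns true while an unexpired block flag is present.
// isDeviceBlocked, storage, and nowMs are hypothetical; in a browser,
// pass window.localStorage and Date.now().
function isDeviceBlocked(storage, nowMs) {
  var expiry = storage.getItem("hg_fexp");
  // Once the expiry time has passed, clear both keys so collection resumes.
  if (expiry && nowMs > Number(expiry)) {
    storage.removeItem("hg_fexp");
    storage.removeItem("hg_flag");
  }
  return storage.getItem("hg_flag") === "1";
}
```

Because the storage object is per (sub)domain, a check like this only ever reflects the flag for the site the visitor is currently on.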
Implementation
To implement the Hit Governor via Adobe Data Collection Tags and the AEP Web SDK, do the following:
Step 1: Create a "Library Loaded (Top-of-Page)" Rule in Tags
In every Tags implementation that I am involved with, I add a rule that fires at the same time the Tags library begins to load. This rule will fire unconditionally and deploy helpful JavaScript functions or other code that the general data collection solution might need to use at any time. Here are the settings I recommend you use when you create your rule:
- Name: Library Loaded (Page Top) ~~ No Conditions ~~ Deploy Global Function
- Event:
- Extension: Core
- Event Type: Library Loaded (Top of Page)
- Conditions: None
- Action:
- Extension: Core
- Action Type: Custom Code
- Language: JavaScript
- Execute Globally: Checked
To finish setting up the Action, click on the "</> Open Editor" button and copy+paste the following block of code into the code editor:
(function(){let h=globalThis.window||this;localStorage.getItem("hg_fexp")&&(new Date).getTime()>Number(localStorage.getItem("hg_fexp"))&&(localStorage.removeItem("hg_fexp"),localStorage.removeItem("hg_flag"));h.hitGovernor=function(a,c,d){a=a||60;c=c||60;d=d||60;if(localStorage.getItem("hg_flag"))localStorage.getItem("hg_flag")==="flagged"&&(a=new Date,a.setTime(a.getTime()+d*864E5),localStorage.setItem("hg_flag","1"),localStorage.setItem("hg_fexp",a.getTime().toString())),localStorage.removeItem("hg_hitt"),localStorage.removeItem("hg_hitc");else{var b=localStorage.getItem("hg_hitc")||"",e=Number(localStorage.getItem("hg_hitt"))||0;d=(new Date).getTime();b=b!==""?b.split("|").map(Number):[0,0,0,0,0];var k=b.reduce(function(f,g){return f+g},0);c=e===0?0:Math.floor((d-e)/(c/6)/1E3);e===0&&localStorage.setItem("hg_hitt",d);if(k<a) {if(c>0){if(c>5)b=[0,0,0,0,0];else for(a=0;a<c;a++)b.unshift(0),b.pop(); localStorage.setItem("hg_hitt",d)}}else localStorage.setItem("hg_flag","flagged"); b[0]++;localStorage.setItem("hg_hitc",b.join("|"));return b.reduce(function(f,g){return f+g},0)}}})();
Click on "Save" to close the editor, then click on "Keep Actions" to close the Action setup. Once you have confirmed that your rule matches the settings listed above, click on "Save to Library".
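Expanded for readability, the minified snippet appears to work roughly as follows. Treat this as an unofficial reading rather than the plugin's source: the function name hitGovernorReadable and the injected storage and nowMs parameters are assumptions added so the logic can run outside a browser, while the storage keys and arithmetic are taken from the snippet above.

```javascript
// Unofficial, expanded reading of the minified hitGovernor logic.
// hitGovernorReadable, storage, and nowMs are hypothetical additions.
function hitGovernorReadable(storage, nowMs, maxHits, windowSeconds, blockDays) {
  maxHits = maxHits || 60;
  windowSeconds = windowSeconds || 60;
  blockDays = blockDays || 60;
  function sum(total, n) { return total + n; }

  var flag = storage.getItem("hg_flag");
  if (flag) {
    // A freshly "flagged" device is converted to the blocking value "1"
    // and given an expiry time blockDays into the future.
    if (flag === "flagged") {
      storage.setItem("hg_flag", "1");
      storage.setItem("hg_fexp", String(nowMs + blockDays * 864e5));
    }
    storage.removeItem("hg_hitt");
    storage.removeItem("hg_hitc");
    return undefined;
  }

  // Five counters, each covering windowSeconds / 6 seconds, newest first.
  var raw = storage.getItem("hg_hitc") || "";
  var buckets = raw !== "" ? raw.split("|").map(Number) : [0, 0, 0, 0, 0];
  var lastShift = Number(storage.getItem("hg_hitt")) || 0;
  var hitsSoFar = buckets.reduce(sum, 0);
  var slotsElapsed = lastShift === 0
    ? 0
    : Math.floor((nowMs - lastShift) / (windowSeconds / 6) / 1000);
  if (lastShift === 0) storage.setItem("hg_hitt", String(nowMs));

  if (hitsSoFar < maxHits) {
    if (slotsElapsed > 0) {
      if (slotsElapsed > 5) {
        buckets = [0, 0, 0, 0, 0]; // everything counted so far has aged out
      } else {
        for (var i = 0; i < slotsElapsed; i++) {
          buckets.unshift(0); // open a new, empty slot
          buckets.pop();      // discard the oldest slot
        }
      }
      storage.setItem("hg_hitt", String(nowMs));
    }
  } else {
    storage.setItem("hg_flag", "flagged"); // threshold met
  }

  buckets[0]++;
  storage.setItem("hg_hitc", buckets.join("|"));
  return buckets.reduce(sum, 0);
}
```

One detail worth noticing in this reading: the call that trips the threshold writes the value "flagged", and it is the following call that converts the flag to "1" and sets the expiry.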
Step 2: Create an "AEP Web SDK: sendEvent complete" Rule in Tags
Tags allows you to deploy a callback function that runs after any AEP Web SDK sendEvent command is complete. For our purposes, the callback will run the hitGovernor function to determine whether the user/device has passed the threshold of sending too many server calls within a certain period of time.
Continue on from before by creating a new Tags rule with the following settings:
- Name: AEP Web SDK sendEvent complete ~~ No Conditions ~~ Run hitGovernor
- Event:
- Extension: Adobe Experience Platform Web SDK
- Event Type: Send event complete
- Conditions: None
- Action:
- Extension: Core
- Action Type: Custom Code
- Language: JavaScript
- Execute Globally: Checked
To finish setting up the Action, click on the "</> Open Editor" button and copy+paste this single line of code into the code editor:
hitGovernor();
Click on "Save" to close the editor, click on "Keep Actions", then click on "Save to Library".
Step 2.5 (Optional): Change the Hit Governor Settings
The hitGovernor function, by default, will flag devices that have sent out sixty server calls within a sixty-second period. Such devices will not send out any more server calls for the next sixty days.
If you want to change these thresholds, you may do so by adding the following three arguments to the hitGovernor function call:
- a (optional, integer): The hit count threshold, or the maximum number of server calls a device can send (within "b" number of seconds) before the code flags the user as a "bot".
- b (optional, integer): The hit time threshold, or the number of seconds in the past during which the function will count the number of server calls sent.
- c (optional, integer): The hit exclusion threshold, or the number of days a device will not be able to send out server calls after being flagged.
For example, if you want to flag devices that send out at least 100 server calls within an 80-second period and prevent them from sending any more server calls for at least 30 days, the code to deploy – refer to the second Tags rule above – will need to change to the following:
hitGovernor(100,80,30)
Be sure to carefully discuss and firmly establish the threshold settings - especially the hit exclusion threshold - with your team before deploying this code.
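If bare positional numbers feel easy to mix up, one option is a small wrapper that maps named settings onto the three arguments. This is only a sketch: callHitGovernor and the property names maxHits, windowSeconds, and blockDays are hypothetical and not part of the Hit Governor itself.

```javascript
// Sketch: map named settings onto hitGovernor's three positional arguments.
// callHitGovernor and the property names are hypothetical.
function callHitGovernor(governor, settings) {
  // Do nothing if the function from the Step 1 rule is not available.
  if (typeof governor !== "function") return undefined;
  settings = settings || {};
  // Any omitted setting is passed as undefined, so the default of 60 applies.
  return governor(settings.maxHits, settings.windowSeconds, settings.blockDays);
}
```

In the second Tags rule, the call would then read callHitGovernor(window.hitGovernor, { maxHits: 100, windowSeconds: 80, blockDays: 30 });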
Step 3: Update the onBeforeEventSend Callback
As mentioned in Part 1 of this blog series, the onBeforeEventSend property needs to "return false" in any case where we want to prevent a device from sending out AEP Web SDK server calls.
Open the AEP Web SDK extension's configuration in Tags and open the code editor for its onBeforeEventSend callback. Once the editor shows up, copy+paste the following line of code anywhere in the code editor:
if(!!localStorage.getItem("hg_flag") && localStorage.getItem("hg_flag") === "1") return false;
The above logic checks for the existence of the localStorage flag (and the flag's value of “1”) before determining whether onBeforeEventSend should return false.
This is what my local machine's onBeforeEventSend property looks like right now:
Be sure to save the changes to the AEP Web SDK extension after making this change.
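A slightly more defensive variant of the same check is sketched below. It wraps the storage read in a try/catch because reading localStorage can throw when the browser has storage disabled. The helper name shouldCancelEvent and its optional storage parameter are assumptions made for illustration.

```javascript
// Sketch: decide whether an outgoing event should be cancelled.
// shouldCancelEvent is a hypothetical helper; with no argument it falls
// back to the browser's own localStorage.
function shouldCancelEvent(storage) {
  try {
    var store = storage || globalThis.localStorage;
    return store.getItem("hg_flag") === "1";
  } catch (e) {
    // Storage is unavailable: let the event through rather than
    // break data collection for everyone.
    return false;
  }
}
```

With the helper defined (for example, alongside the Step 1 code), the line in the onBeforeEventSend editor could become: if (shouldCancelEvent()) return false;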
Step 4: Deploy the Code
Feel free to deploy and publish the two new Tags rules and the updated AEP Web SDK Extension to your Tags libraries as needed.
Step 5: Run the Code
Testing the Hit Governor at scale is not easy, since practically anonymous devices need to meet the threshold settings before the code can flag them as "bots" in the first place. Despite this annoyance, you can see the Hit Governor in action yourself by mimicking this type of behavior in your browser.
For example, on my local machine, I can use my browser's console (via the browser’s Developer Tools) to quickly and repeatedly call the sendEvent command…
alloy("sendEvent",{"type":"commerce.purchases","xdm":adobeDataLayer.getState()});
…to the point where I meet the Hit Governor's default "60-60-60" threshold. Once I reach the threshold, the following messages appear any time I try to call the sendEvent command again:
Apparently, my own code now thinks I’m a bot! And yes, all other devices that end up meeting the same threshold as I did will end up with a similar fate.
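Rather than re-running the command by hand, a short loop in the console can fire the burst for you. The helper below is a rough sketch: fireTestEvents is a hypothetical name, and it assumes the alloy function returns a promise for each sendEvent call.

```javascript
// Sketch: fire a burst of sendEvent calls to exercise the Hit Governor.
// fireTestEvents is a hypothetical helper; alloyFn is the page's alloy function.
async function fireTestEvents(alloyFn, count, options) {
  var results = { sent: 0, failed: 0 };
  for (var i = 0; i < count; i++) {
    try {
      // Wait for each call so that the events complete one after another.
      await alloyFn("sendEvent", options);
      results.sent++;
    } catch (e) {
      results.failed++; // the call was rejected, whatever the cause
    }
  }
  return results;
}
```

For example, fireTestEvents(alloy, 65, {"type":"commerce.purchases","xdm":adobeDataLayer.getState()}) should be enough to cross the default threshold. Only run something like this against a development environment, since every call that goes out is a real server call.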
Conclusion
I hope you find this and my other posts about bot blocking helpful! I will probably have one more post about bot blocking later on; in the meantime, please post your questions below about anything else you are curious to learn about.
Thanks everyone!