Better Bot Blocking (Part 1): The Basics

11/4/24

Adobe provides multiple ways to exclude data that you have already collected from your final Analytics reporting data set. The most common method in use today is the IAB Bot Filtering tool, which (by default) excludes any hit whose User-Agent header matches a list of “bot-based” headers provided by the IAB itself. The tool also lets you supply your own “custom” list of User-Agents or IP Addresses whose hits you want to exclude.

Using IAB Bot Filtering or other such tools is a great first step, but your Adobe Analytics implementation will still face two major problems:

  1. Not all incoming hits can be identified as bots by just examining the User-Agent header or the IP Address; some bots use ordinary User-Agent headers and/or can readily change their IP Addresses to mask their activity.
  2. Even if your IAB Bot Filtering setup is able to properly identify all of your incoming bot activity, that bot activity is identified only after an Adobe Analytics server call is sent out, so every bot hit still counts as a collected server call.

I personally haven't done any deep, widespread analysis of bot traffic or of the other hits that end up being excluded from reporting via the various tools that Adobe provides, but I have seen cases where IAB-identified hits have amounted to thirty-three(!) percent of the Adobe server calls that a client has collected. Yes, you read that right. Thirty-three percent.

To see your bot numbers, you would create a simple report that looks like the following:

kenmckell_0-1730734910842.png

The above example shows that during September 2024, 160 million hits in this report suite came from bots, while 320 million hits came from non-bots.    This report suite collected 480,000,000 hits overall, but a third of those hits ended up being practically useless!

If a large percentage of the server calls your implementation sends to Adobe comes from bots and bad IP addresses, feel free to ask yourself whether you need to collect such data in the first place.   I mean, after all, collecting fewer server calls – especially from useless sources – just makes sense from a budgetary perspective, right?!

If you think the answer to this obviously manipulative question is "yes", read on.  We can't do much to change the IAB tool itself, but, as an alternative solution, we can easily change your implementation to stop these bots from sending server calls to Adobe in the first place!

The rest of this blog post will show you the basics of how to prevent unwanted Adobe server calls from being generated. Future blog posts will demonstrate ideas on how to actually identify bots or easily group IP Addresses together. Keep in mind that the following changes are not retroactive: they affect only the data that is collected after you deploy them.

Blocking AEP Web SDK Server Calls

Adding the following line of code to the onBeforeEventSend callback of the Adobe Experience Platform Web SDK

 

return false;

 

…will prevent the sendEvent command from generating a server call. That's it.

Now obviously, you wouldn't want this line of code to run all of the time; otherwise, your Adobe implementation wouldn't collect any data!  So be sure to add some additional logic that would cause that line of code to run under only certain conditions.  Here's a very simple example:

 

if(iDontWantToCollectThisData === true)
    return false;

 

The variable iDontWantToCollectThisData is merely a "MacGuffin" placeholder here – I include it only to demonstrate the type of boolean logic you could include to determine whether to run the return false statement or not.
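
For implementations that configure the Web SDK in JavaScript rather than through the Tags extension, the same idea can be sketched roughly as follows. This is a minimal sketch, not a definitive setup: the datastream and organization IDs are placeholders, the flag is the same placeholder variable used above, and the configure call is guarded so that it runs only on a page where the Web SDK base code has defined alloy.

```javascript
// Placeholder flag from this post; set it with your own detection logic.
var iDontWantToCollectThisData = false;

// Callback for the Web SDK's onBeforeEventSend setting.
// Returning false cancels the event, so no server call is generated.
function onBeforeEventSend(content) {
  if (iDontWantToCollectThisData === true) {
    return false;
  }
  // Returning nothing lets the event proceed as usual.
}

// Placeholder configuration values (assumptions); replace with your own.
var alloyConfig = {
  datastreamId: "YOUR-DATASTREAM-ID",
  orgId: "YOUR-ORG-ID@AdobeOrg",
  onBeforeEventSend: onBeforeEventSend
};

// Configure the SDK only where the base code has defined alloy.
if (typeof alloy === "function") {
  alloy("configure", alloyConfig);
}
```

Older Web SDK versions name the datastream setting edgeConfigId instead of datastreamId, so check which one your version expects.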

Blocking AppMeasurement Server Calls

For AppMeasurement-based implementations, you will need to set the abort variable equal to true to prevent the AppMeasurement code from generating server calls.

 

s.abort=true

 

The usual place to set the abort variable would be in the doPlugins function, for example:

 

s.doPlugins = function() {
    if(iDontWantToCollectThisData === true)
        s.abort=true;
}
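
A slightly fuller sketch of the same pattern is shown below, under a few assumptions: the stand-in s object exists only so the snippet can run outside a real page (on a real page, s is the tracker your AppMeasurement code already created), and the flag is still the placeholder variable from this post. It also sets usePlugins, since AppMeasurement calls doPlugins only when that variable is true.

```javascript
// Stand-in for the AppMeasurement tracker so this sketch runs on its own;
// on a real page, `s` already exists and this line simply reuses it.
var s = (typeof window !== "undefined" && window.s) || {};

// Placeholder flag from this post; set it with your own detection logic.
var iDontWantToCollectThisData = false;

// doPlugins is called before each tracking call only when usePlugins is true.
s.usePlugins = true;

s.doPlugins = function (s) {
  if (iDontWantToCollectThisData === true) {
    // Tell AppMeasurement not to send this server call.
    s.abort = true;
  }
};
```

Adobe's documentation notes that abort resets to false after every tracking call, which is why the check belongs in doPlugins, where it is re-evaluated for each call.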

 

Blocking via Adobe Data Collection/Tags Rule Conditions

You can add similar logic to each of your Adobe Data Collection Tags rules via a Custom Code condition.  The following screenshots demonstrate a simple example of a condition that would need to be met when a pageLoad event takes place.

kenmckell_1-1730734910844.png

kenmckell_2-1730734910846.png

You could deploy something like the following code after clicking on the "</> Open Editor" button:

kenmckell_3-1730734910848.png

In this example, if a pageLoad event takes place, the rule’s condition must be met (return true) in order for the rule’s actions to occur.  If the condition returns false, none of the rule’s actions, which may include the deployment of an AppMeasurement or Web SDK server call - not to mention any other solutions you might consider deploying via Tags - will occur.

One drawback to the Tags approach is that you would probably need to add the same condition to all of your rules. Doing so might take a little more time to set up, depending on the complexity of your implementation.
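
One way to soften that drawback is to keep the decision in a single shared function and have each rule's Custom Code condition call it. The sketch below assumes a hypothetical helper named shouldCollectData, defined once somewhere that loads before your rules run; neither the name nor its location comes from Adobe, and the flag is still the placeholder variable from this post.

```javascript
// Hypothetical shared check; the name shouldCollectData is illustrative only.
// `flags` is any object that carries the placeholder flag used in this post.
function shouldCollectData(flags) {
  // Collect unless the flag has been explicitly set to true.
  return flags.iDontWantToCollectThisData !== true;
}

// Inside a rule's Custom Code condition editor, the body could then be
// a single line such as:
//   return shouldCollectData(window);
```

With this arrangement, a change to the blocking logic happens in one place, although each rule still needs the one-line condition.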

Conclusion

Now that you understand why you should stop Adobe server call generation at the source and have some basic information on how to do it, feel free to experiment with your implementation and see what’s possible! Be sure to also talk with your team members to find out what sources of data they would like to try blocking.

As mentioned, forthcoming blog posts will introduce some extra utilities and features that should make bot identification easier to pull off.  So, stay tuned, and (in the meantime) happy blocking!

 
