Bot Traffic Through Raw Full URL | Community
skatofiabah
Level 5
October 23, 2024
Question

Bot Traffic Through Raw Full URL


Hi All,

 

We have a process in Adobe Analytics where we look at the Raw Full URL of the page, then the Operating System, then the IP Address to identify bot traffic. We are seeing full URLs with "?test=1" in the query string that seem to be bot traffic. Are there other query string parameters or patterns in the raw full URL that would also indicate bot traffic?

 

Thanks!


2 replies

PratheepArunRaj
Community Advisor and Adobe Champion
October 23, 2024

Dear @skatofiabah ,

'?test=1' does not necessarily mean a bot; more likely, somebody is testing the page load in production (it might be your QC team) or trying to bypass the Akamai cache (we usually use a query string parameter for that).

I am not convinced that a query string parameter is a reliable way to identify bots.
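
To show why a parameter such as test=1 is a weak signal on its own, here is a minimal Python sketch that tallies query string parameters across a list of raw URLs. The sample URLs and the function name count_query_params are made up for illustration; in practice the list would come from an export of the Raw Full URL dimension. A parameter that shows up often is a prompt to check who is sending it, not proof of a bot.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl


def count_query_params(urls):
    """Count how often each key=value pair appears in the query strings."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        # keep_blank_values=True so parameters like "?debug=" are still counted
        for key, value in parse_qsl(query, keep_blank_values=True):
            counts[f"{key}={value}"] += 1
    return counts


# Hypothetical sample data, not taken from a real report
sample_urls = [
    "https://www.example.com/home?test=1",
    "https://www.example.com/home?test=1&utm_source=mail",
    "https://www.example.com/products",
    "https://www.example.com/home?utm_source=mail",
]

param_counts = count_query_params(sample_urls)
for pair, hits in param_counts.most_common():
    print(pair, hits)
```

From there you could break the frequent parameters down by IP address or operating system to see whether they come from an internal team or a CDN check.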

Thank You, Pratheep Arun Raj B (Arun) | NextRow Digital | Terryn Winter Analytics

Thank You, Pratheep Arun Raj B (Arun) | Xerago | Terryn Winter Analytics
vernon_h
Level 2
October 24, 2024

Hi @skatofiabah,

 

Agreed that the query string parameter on its own would not be indicative of a bot. Personally, what we use is a mixture of user agent, IP address, country, click behaviour data, timestamp and page view events to help identify the larger bots. I would also check that the IAB bot filter is turned on (though it doesn't capture new or smaller bots).
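
As a rough illustration of combining some of these signals, the Python sketch below groups hits by IP address and user agent and flags groups with an unusually high page view rate. It covers only a subset of the signals listed above. The field names, the sample hits and the thresholds (min_hits, max_per_minute) are all assumptions for the example and would need tuning against real data.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def flag_high_rate_visitors(hits, min_hits=5, max_per_minute=30.0):
    """Return (ip, user_agent) keys whose page view rate looks automated.

    The default thresholds are arbitrary assumptions, not recommended values.
    """
    grouped = defaultdict(list)
    for hit in hits:
        grouped[(hit["ip"], hit["user_agent"])].append(hit["timestamp"])

    flagged = []
    for key, stamps in grouped.items():
        if len(stamps) < min_hits:
            continue  # too few hits to judge
        span_seconds = (max(stamps) - min(stamps)).total_seconds()
        # Treat very short spans as one second to avoid dividing by zero
        minutes = max(span_seconds, 1.0) / 60.0
        if len(stamps) / minutes > max_per_minute:
            flagged.append(key)
    return flagged


# Hypothetical hits: one visitor fires 10 page views in 9 seconds,
# the other fires 5 page views over 4 minutes.
start = datetime(2024, 10, 23, 12, 0, 0)
sample_hits = [
    {"ip": "203.0.113.7", "user_agent": "ua-a", "timestamp": start + timedelta(seconds=i)}
    for i in range(10)
] + [
    {"ip": "198.51.100.4", "user_agent": "ua-b", "timestamp": start + timedelta(minutes=i)}
    for i in range(5)
]

suspects = flag_high_rate_visitors(sample_hits)
print(suspects)
```

A rate check like this is only one input; country and click behaviour would be layered on top before excluding anything.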

Cheers,

Vernon

skatofiabah
Level 5
October 24, 2024

Hi @vernon_h,

What is User Agent? I don't see that dimension in Adobe Analytics. Are there other ways to slice the data besides the dimensions above?

Level 2
October 28, 2024

Hi @vernon_h,

 

We will investigate this. What does a normal user agent look like vs. a bad user agent if we capture it in an eVar?


Hi @skatofiabah ,

A normal user agent will look something like this: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36'. You can find your own user agent by opening your browser's dev tools, going to the Console tab and typing navigator.userAgent.


The first step I would recommend, as @jennifer_dungan has mentioned, is to capture these user agents in an eVar (note that they sometimes exceed the 255 character limit). Then evaluate whether there's a way to identify the slightly more obvious bots. I would look out for Linux OS, headless Chrome and organisational names/URLs. You can do a Google search on the user agent to get more information.

 

Two examples:
"mozilla/5.0 (x11; linux x86_64) applewebkit/537.36 (khtml, like gecko) headlesschrome/121.0.6167.0 safari/537.36"
"mozilla/5.0 (windows nt 10.0; win64; x64) applewebkit/537.36 (khtml, like gecko) chrome/126.0.0.0 safari/537.36 observepoint"
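
A first pass over captured user agents could be sketched in Python as below. The marker list contains only the two markers from the examples above and is purely illustrative, not a complete bot list; the function names are hypothetical. The length check reflects the 255 character eVar limit mentioned earlier.

```python
# Substrings that hint at automation. Only the two markers from the
# examples in this thread are listed; a real list would be much longer.
SUSPICIOUS_MARKERS = ("headlesschrome", "observepoint")

EVAR_MAX_LENGTH = 255  # eVar character limit mentioned in this thread


def find_bot_markers(user_agent):
    """Return the suspicious markers found in a user agent string."""
    lowered = user_agent.lower()
    return [marker for marker in SUSPICIOUS_MARKERS if marker in lowered]


def exceeds_evar_limit(user_agent):
    """True if the value is too long to fit in an eVar without being cut off."""
    return len(user_agent) > EVAR_MAX_LENGTH


normal_ua = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36"
)
headless_ua = (
    "mozilla/5.0 (x11; linux x86_64) applewebkit/537.36 "
    "(khtml, like gecko) headlesschrome/121.0.6167.0 safari/537.36"
)
observepoint_ua = (
    "mozilla/5.0 (windows nt 10.0; win64; x64) applewebkit/537.36 "
    "(khtml, like gecko) chrome/126.0.0.0 safari/537.36 observepoint"
)

for ua in (normal_ua, headless_ua, observepoint_ua):
    print(find_bot_markers(ua), exceeds_evar_limit(ua))
```

An empty result only means none of the listed markers matched; it does not mean the traffic is human.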

 

So far, we're only focusing on excluding the larger and more disruptive bots. Smaller ones have legitimate-looking user agents, so you'll need a couple of the other metrics mentioned above to identify them.

Cheers,
Vernon