Segmentation: Bot Traffic Identification & Exclusion Tool

MichaelWon
Level 9

09-01-2017

I need help identifying bot traffic because we get a ton of it: somewhere between 30-40% of our page views and 5-15% of visits. I am not referring to known bots (e.g. Google's spider) or malicious bots attempting to take down our site or defraud us. I am referring to third-party scrapers coming to us for information. This type of traffic is not viewed negatively because 1) it is not harmful to our site experience, 2) everyone does it, and 3) it is difficult to police.

Because we get so much bot traffic, we spend a chunk of time working out whether swings in our KPIs are real or due to non-human traffic. This slows us down considerably. The bots coming to our site use standard devices, user agent strings, and operating systems, and also change their IP addresses frequently. I am able to qualitatively identify this traffic because of the following:

1. This traffic is typed/bookmarked.

2. This traffic never has any of our campaign parameters.

3. This traffic lands on pages that would not normally be a direct landing page (e.g. a specific product page).

4. This traffic is from the 'Other' device type.

5. Page Views = 1 per visit.

6. Visits = Visitors, and the number of visits is very high (i.e. > 1k) when looking at captured IP addresses.

So, whoever is crawling our site is deleting their cookies on the same IP address and viewing a single page per visit. See attached for a screenshot.

It would be great to somehow aggregate visits from different visitors (cookies) where certain behaviors are taking place. For example:

Exclude all 'Visitors' if

1. 'Any value' for a given variable (evar/prop) shows up more than X times.

AND

2. PVs per Visit for each visit <= 1

AND

3. Traffic Source for all visits is typed/bookmarked.

We can solve for this in SQL, but I'm not sure it's doable in Adobe. Any thoughts?
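As a rough illustration of the three-condition rule above, here is a minimal Python sketch that works on visit-level data exported from the analytics tool. It is not an Adobe feature or API: the `Visit` record, its field names, the `flag_bot_visitors` function, and the `min_occurrences` threshold (the "X" in the rule) are all hypothetical. It also assumes one reading of the rule, namely that "shows up more than X times" counts visits per variable value (for example, a captured IP address).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Visit:
    """One visit row from an export (hypothetical schema)."""
    visitor_id: str      # cookie-based visitor ID
    key_value: str       # value of the chosen eVar/prop, e.g. a captured IP
    page_views: int      # page views in this visit
    traffic_source: str  # e.g. "typed/bookmarked"


def flag_bot_visitors(visits, min_occurrences, direct_label="typed/bookmarked"):
    """Return the visitor IDs to exclude under the three-condition rule."""
    # Group visits by the value of the chosen variable.
    by_key = defaultdict(list)
    for visit in visits:
        by_key[visit.key_value].append(visit)

    flagged = set()
    for group in by_key.values():
        shows_up_often = len(group) > min_occurrences              # condition 1
        single_page = all(v.page_views <= 1 for v in group)        # condition 2
        all_direct = all(v.traffic_source == direct_label for v in group)  # condition 3
        if shows_up_often and single_page and all_direct:
            flagged.update(v.visitor_id for v in group)
    return flagged


# Example: five single-page direct visits from one IP, each with a fresh cookie,
# plus one ordinary visit from another IP.
sample = [Visit(f"cookie-{i}", "203.0.113.7", 1, "typed/bookmarked") for i in range(5)]
sample.append(Visit("human-1", "198.51.100.2", 4, "search"))
flagged = flag_bot_visitors(sample, min_occurrences=3)
```

Because all three conditions must hold for every visit sharing a value, a single multi-page or campaign-driven visit from the same IP keeps that whole group out of the exclusion set. That is deliberately conservative; a looser variant could require the conditions for only a large share of the visits.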

12 Comments

sumkumar
Level 1

05-02-2020

One approach is to filter out the suspicious IPs and user agents and exclude them from the analysis retrospectively:

https://www.linkedin.com/pulse/bot-filtering-adobe-analytics-using-api-14-python-sumeet-kumar/
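The linked article is not reproduced here. As a generic sketch of retrospective filtering, assuming hit rows have already been exported as dictionaries with hypothetical `ip` and `user_agent` keys and that the blocklists were compiled separately, the exclusion step might look like this:

```python
def exclude_suspicious(rows, bad_ips, bad_ua_fragments):
    """Keep only rows whose IP and user agent are not on the blocklists.

    rows: iterable of dicts with "ip" and "user_agent" keys (hypothetical schema).
    bad_ips: set of IP address strings to drop.
    bad_ua_fragments: substrings that mark a user agent as suspicious.
    """
    fragments = [fragment.lower() for fragment in bad_ua_fragments]
    kept = []
    for row in rows:
        if row.get("ip") in bad_ips:
            continue  # drop rows from a blocklisted IP
        user_agent = row.get("user_agent", "").lower()
        if any(fragment in user_agent for fragment in fragments):
            continue  # drop rows whose user agent matches a fragment
        kept.append(row)
    return kept


rows = [
    {"ip": "203.0.113.7", "user_agent": "Mozilla/5.0"},
    {"ip": "198.51.100.2", "user_agent": "ExampleScraper/1.0"},
    {"ip": "198.51.100.3", "user_agent": "Mozilla/5.0"},
]
clean = exclude_suspicious(rows, bad_ips={"203.0.113.7"}, bad_ua_fragments=["scraper"])
```

A filter like this only helps against scrapers whose IPs or user agents are recognizable, so it may miss bots that use standard user agent strings and rotate addresses.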

trevorpaulsen
Employee

10-02-2021

Currently, we have a number of Exchange partners who offer sophisticated bot detection capabilities for Adobe Analytics. I'd recommend checking out what our partners have to offer, such as this one.