SOLVED

Bot traffic filtering query

Level 2

We are trying to analyse our historical data to filter out bots, and we have a list of IP addresses that could be potential bots.

We were planning to configure them in the bot filter section.

But what if these IPs do not remain bots in the future?

What are the chances of that? How often can an IP address change?

1 Accepted Solution


Correct answer by
Level 7

I would be interested to see how others deal with this also.

As a travel company, we have a large number of bots and competitors price scraping our site. To maximise compatibility between Adobe tools (Workspace, Reports, Ad Hoc, Data Warehouse, etc.), we have created a Virtual Report Suite that filters out as much of this traffic as possible. However, we do not filter on IP but rather on domain. Using this technique, we have found that what we see is troublesome domains rather than troublesome IP addresses.

Our main check is that we look at domains where we have >100 unique visitors in a week with a visits per visitor ratio of 1.0 and a conversion rate of 0%. This has also highlighted a few things to us:

User agent strings are often a good indicator of a bot

Adobe's domain list is not awful but clearly not as up to date as others

Query string parameters can also help identify a bot if they are manipulating on-site search.
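
For anyone wanting to try the main check above outside of Workspace, here is a rough Python sketch of it. It assumes you have already exported one row per domain per week; the `DomainWeek` fields, the example domains, and the definition of conversion rate as conversions divided by visits are all my own assumptions for illustration, not anything Adobe provides.

```python
from dataclasses import dataclass


@dataclass
class DomainWeek:
    """One week of aggregated traffic for a single domain (hypothetical export shape)."""
    domain: str
    unique_visitors: int
    visits: int
    conversions: int


def is_suspect(row: DomainWeek, min_visitors: int = 100) -> bool:
    """Apply the check from the post: >100 unique visitors in the week,
    a visits-per-visitor ratio of exactly 1.0, and a 0% conversion rate."""
    if row.unique_visitors <= min_visitors:
        return False
    visits_per_visitor = row.visits / row.unique_visitors
    # Assumption: conversion rate is conversions / visits.
    conversion_rate = row.conversions / row.visits if row.visits else 0.0
    return visits_per_visitor == 1.0 and conversion_rate == 0.0


def suspect_domains(rows: list[DomainWeek]) -> list[str]:
    """Return the domains that match the check, in input order."""
    return [row.domain for row in rows if is_suspect(row)]


# Made-up sample data, purely illustrative.
rows = [
    DomainWeek("scraper-host.example", unique_visitors=450, visits=450, conversions=0),
    DomainWeek("isp.example", unique_visitors=900, visits=1400, conversions=35),
    DomainWeek("small.example", unique_visitors=40, visits=40, conversions=0),
]

print(suspect_domains(rows))
```

The output of something like this would be a candidate list to review by hand before adding anything to a Virtual Report Suite segment, since a legitimate domain with a quiet week could also match.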

As for syncing this retrospectively with our data lake, well that's another problem 😕

Thanks

Dave
