SOLVED

Bot traffic filtering query

Level 2

We are trying to analyse our previous data to filter out bots, and we have a list of IPs that could be potential ones.

We were planning to configure these IPs in the bot filtering section.

But what if these IPs do not remain bots in the future?

What are the chances? How often can an IP change?
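
A minimal sketch of this kind of analysis, assuming a hypothetical hit-level export with columns ip, visit_id and order_completed (not an Adobe Analytics schema):

```python
import pandas as pd

# Hypothetical hit-level export and column names (ip, visit_id,
# order_completed) -- assumptions for illustration, not an Adobe schema.
hits = pd.read_csv("hits_last_90_days.csv")

with open("candidate_bot_ips.txt") as f:
    candidate_ips = {line.strip() for line in f if line.strip()}

# For each candidate IP, check whether its traffic still looks bot-like:
# plenty of hits and zero conversions. An IP that starts converting may
# have been reassigned to real users and deserves review before filtering.
per_ip = (
    hits[hits["ip"].isin(candidate_ips)]
    .groupby("ip")
    .agg(hits=("visit_id", "size"), orders=("order_completed", "sum"))
)

still_bot_like = per_ip[(per_ip["hits"] > 100) & (per_ip["orders"] == 0)]
needs_review = per_ip.drop(still_bot_like.index)

print(f"{len(still_bot_like)} IPs still look bot-like, {len(needs_review)} need review")
```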

1 Accepted Solution

Correct answer by
Level 8

I would be interested to see how others deal with this also.

As a travel company, we have a large number of bots / competitors that are price-scraping our site. In order to maximise compatibility between the Adobe tools (Workspace / Reports / Ad Hoc Analysis / Data Warehouse etc.) we have created a Virtual Report Suite that filters out as much of that traffic as possible. However, we do not actually filter on IP but rather on domain; using this technique, what we see is troublesome domains rather than IP addresses.

Our main check is to look at domains where we have >100 unique visitors in a week, with a visits-per-visitor ratio of 1.0 and a conversion rate of 0% (a rough sketch of this check follows the list below). This has also highlighted a few things to us:

User-agent strings are often a good indicator of a bot.

Adobe's domain list is not awful, but it is clearly not as up to date as others.

Query string parameters can also help identify a bot if it is manipulating on-site search.
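
A rough sketch of the check above, assuming a hypothetical weekly visit-level export with columns domain, visitor_id, visit_id and orders:

```python
import pandas as pd

# Hypothetical weekly visit-level export; "domain" is the visitor's
# network domain. Column names are assumptions for illustration.
visits = pd.read_csv("visits_week.csv")  # domain, visitor_id, visit_id, orders

by_domain = visits.groupby("domain").agg(
    unique_visitors=("visitor_id", "nunique"),
    total_visits=("visit_id", "nunique"),
    orders=("orders", "sum"),
)
by_domain["visits_per_visitor"] = by_domain["total_visits"] / by_domain["unique_visitors"]

# The check described above: high volume, exactly one visit per visitor,
# and no conversions at all during the week.
suspect = by_domain[
    (by_domain["unique_visitors"] > 100)
    & (by_domain["visits_per_visitor"] == 1.0)
    & (by_domain["orders"] == 0)
]
print(suspect.sort_values("unique_visitors", ascending=False))
```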

As for syncing this retrospectively with our data lake, well that's another problem :-/

Thanks

Dave

2 Replies

Level 1

I believe the only solution could be user-agent and IP filters (IP ranges).
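
A minimal sketch of such a filter using Python's standard ipaddress module; the ranges and user-agent tokens below are illustrative placeholders, not a vetted bot list:

```python
import ipaddress
import re

# Illustrative placeholders -- a real list would come from your own
# analysis or a maintained bot list, not these example values.
BOT_RANGES = [ipaddress.ip_network(r) for r in ("66.249.64.0/19", "203.0.113.0/24")]
BOT_UA = re.compile(r"bot|crawler|spider|scrape", re.IGNORECASE)

def looks_like_bot(ip: str, user_agent: str) -> bool:
    """Flag a hit whose IP falls in a listed range or whose UA matches."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in BOT_RANGES):
        return True
    return bool(BOT_UA.search(user_agent))

print(looks_like_bot("66.249.70.1", "Mozilla/5.0"))                # True: IP range
print(looks_like_bot("198.51.100.7", "ExampleBot/1.0 (crawler)"))  # True: user agent
print(looks_like_bot("198.51.100.7", "Mozilla/5.0"))               # False
```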

I understand your concern that in the future these IP ranges might no longer signify bots, but that is too small a probability to worry about: the IPs would need to stop being bots and also become customers for your worry to come true.

I do not agree that query string parameters can be used, as they are mostly produced by our own settings and scripts; identification using query string parameters is therefore not as reliable.