Hi there, I have a unique problem. We have non-headless bot crawlers on our site that visit one specific page, leave, change their IP via iCloud Private Relay, change their user agent, clear cache and cookies, and then revisit the same page to scrape data.
Typically we would implement bot rules that exclude IPs from a given ISP/bot farm, but with iCloud Private Relay, the gigantic pool of IPs being used overlaps with real customers! We have an exclusion segment in place, with talks of making a virtual report suite that also gets around these bots. Before we spend billable hours though, I just wanted to sanity check whether anyone here would do it differently. Am I missing anything regarding bot rules, or have these bots truly stumped Adobe's frontline capabilities? I know this isn't an Adobe Launch forum, but surely there's nothing to be done there either without a mountain of (arguably) delicate JS.
While brainstorming, it occurred to me that I could make a better exclusion segment if there were a way to get ISP data in Workspace. I cannot find one, and just wanted to sanity check that too.
Thank you kindly!
PS: this is iCloud Private Relay's egress IP address list (it's huge):
https://mask-api.icloud.com/egress-ip-ranges.csv
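For anyone who wants to test hits against that list offline, here is a minimal sketch of CIDR matching with Python's standard `ipaddress` module. It assumes the CSV keeps the CIDR range in the first column, and that you have raw hit data with an IP address to check. The sample rows are documentation ranges used as placeholders, not real relay ranges, and the function names are hypothetical.

```python
import csv
import io
import ipaddress

# Placeholder rows only: documentation ranges, NOT real relay ranges.
# Assumed layout: CIDR in the first column, location fields after it.
SAMPLE_CSV = """192.0.2.0/24,US,US-CA,Example City,
2001:db8::/32,US,US-NY,Example Town,
"""

def load_networks(csv_text):
    """Parse the first column of each non-empty row into a network object."""
    networks = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row or not row[0].strip():
            continue
        networks.append(ipaddress.ip_network(row[0].strip(), strict=False))
    return networks

def is_relay_ip(ip_text, networks):
    """Return True if the address falls inside any of the loaded ranges."""
    ip = ipaddress.ip_address(ip_text)
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so a mixed list can be scanned without splitting it by version.
    return any(ip in net for net in networks)

networks = load_networks(SAMPLE_CSV)
print(is_relay_ip("192.0.2.55", networks))
print(is_relay_ip("198.51.100.7", networks))
```

This only tells you a hit came through a listed range. It cannot separate bots from real customers on the same relay, so it would need to be combined with another signal such as the page visited or the user agent. The linear scan is also slow for a list this large, so grouping or indexing the ranges would be worth it for real volumes.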
I looked up a few of the IP addresses listed in an IP lookup tool... it indicated that the domain identifies as "akaquill.net"
If that holds true for all of them, can you see if Adobe has captured that in the Domain dimension? If so, you should be able to create segments based on the Domain that you can use on your virtual report suites.
Thank you Jennifer. The domain dimension value you refer to does not account for the majority of this service's traffic, but it's nice to know I have exhausted the other possible solutions. I think the only remaining possibility is a Launch rule condition that prevents network calls based on the bot's behavior, as a means to reduce server call costs, but these bots are a drop in the bucket outside of niche reports.
Honestly, sometimes all of Apple / Safari's "privacy" things are a real pain in the behind... Most web analytics isn't drilling down to look at a specific person; we are interested in trends... but their paranoia campaigns make everyone think that we're going to know everything about them, and people buy into this nonsense... making everything harder...
Good luck... you can also try digging into the raw data to see if you can identify the traffic better?
You might also want to post a suggestion to Adobe to see if they can maintain some sort of list that all clients could access, as this might start to become a hot topic for other people as well.
Hi @cameron_tarle ,
Also have a look at the blog posts below from @kenmckell, where he explains bot detection and blocking in detail:
Better Bot Blocking (Part 1): The Basics
Better Bot Blocking (Part 2): Identifying Bots and Leveraging CIDR
Better Bot Blocking (Part 3): The Hit Governor
These might give you an idea about how to better tackle this issue.
Cheers!
Thank you Haveer, part 3 is very underrated. We will look into that as a solution, though I do not know whether a Chrome web driver is likely to remove cookies and local storage. Something to experiment with for sure.
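To make the cookie/local storage concern concrete, here is a rough sketch of the general idea, assuming a "hit governor" means capping how many hits one visitor may send within a time window. It is not the implementation from the blog post, and a real one would live in client-side tag code rather than Python. The class name, thresholds, and visitor IDs are all hypothetical.

```python
import time
from collections import defaultdict, deque

class HitGovernor:
    """Allow at most max_hits per visitor within a sliding time window."""

    def __init__(self, max_hits, window_seconds):
        self.max_hits = max_hits
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # visitor_id -> timestamps of allowed hits

    def allow(self, visitor_id, now=None):
        """Return True if this hit should be sent, False if it is suppressed."""
        now = time.monotonic() if now is None else now
        hits = self._hits[visitor_id]
        # Forget hits that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_hits:
            return False
        hits.append(now)
        return True

governor = HitGovernor(max_hits=2, window_seconds=60)
print(governor.allow("visitor-a", now=0))  # first hit
print(governor.allow("visitor-a", now=1))  # second hit
print(governor.allow("visitor-a", now=2))  # over the cap within the window
print(governor.allow("visitor-b", now=2))  # a "new" visitor starts from zero
```

The last call is the catch: if the bot clears whatever storage holds the counter, it shows up as a new visitor and the cap starts over, which is exactly what needs testing.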
We saw an increase in traffic like this in the fall. I know we didn't catch all of it, but we noticed that a lot of the hits were coming in on old Google mobile user agents. Filtering those out brought our traffic down to expected levels.
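If you want to try the same thing on raw data, here is a small sketch. It assumes "old Google mobile user agents" means outdated Chrome-on-Android strings, and the version cutoff is a made-up threshold you would tune against your own traffic. The function names and the sample user agent are for illustration only.

```python
import re

# Matches the major version in tokens like "Chrome/68.0.3440.91".
CHROME_VERSION = re.compile(r"Chrome/(\d+)\.")

def chrome_major(user_agent):
    """Return the Chrome major version as an int, or None if absent."""
    match = CHROME_VERSION.search(user_agent)
    return int(match.group(1)) if match else None

def is_stale_mobile_chrome(user_agent, min_major):
    """Flag mobile Chrome user agents older than the chosen cutoff."""
    major = chrome_major(user_agent)
    return major is not None and "Mobile" in user_agent and major < min_major

sample = (
    "Mozilla/5.0 (Linux; Android 8.0.0; Pixel 2) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/68.0.3440.91 Mobile Safari/537.36"
)
# min_major=100 is a hypothetical cutoff, not a recommendation.
print(is_stale_mobile_chrome(sample, min_major=100))
```

Real customers on old devices will match too, so it is worth checking how many flagged hits also land on the scraped page before excluding them.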