[High Discrepancy] Traffic Metrics (Pageviews) between Adobe Analytics and Splunk (html_ref) Requests | Community
abhinavpuri
Community Advisor and Adobe Champion
May 15, 2024
Solved

  • 2 replies
  • 1530 views

Hi Everyone -

We have noticed a large difference in volume between Adobe Analytics page views for a specific page (filtered to that page) and Splunk html_ref requests for the same page in server-side logging.

The difference is around 1:10, meaning that for every 100 page views in Adobe Analytics we see 1,000 HTML resource requests in Splunk. Please note: Splunk logs all resources, but we've filtered to ".html" document requests only.

Has anyone else encountered this scenario? What would be a possible solution?

Best answer by Krishna_Musku


2 replies

Krishna_Musku
Community Advisor
Accepted solution
May 15, 2024

@abhinavpuri Did you check the real-time logs in Splunk while accessing that specific page? See how many HTML requests are generated. I don't know much about Splunk, but I have heard that it generates a lot of logs for everything. Looking at the logs may help, although there will be many entries containing other information about the page. Watching the logs in real time might be a good first troubleshooting step.

abhinavpuri
Community Advisor and Adobe Champion
May 16, 2024

Hi @krishna_musku - I tried this approach as well, to rule out duplicate calls in the Splunk logs.
I filtered down to the specific HTML resource and my IP address and checked in real-time monitoring.
For every page view call in Adobe Analytics, we see a single hit recorded in Splunk.

 

My assumption is that the difference between the Splunk logs and Adobe Analytics could be due to the following:

- We have cookie consent implemented on the website, with an average acceptance rate of 70% for analytics cookies, so this would introduce a ~30% difference in counts.

- Some modern browsers, such as Firefox with its Enhanced Tracking Protection (ETP) feature, block marketing and analytics tracking calls on the client side.

- Adobe Analytics cannot track programmatic access to resources, for example from Postman or from crawlers on the website.

- Adobe Analytics has IAB-based bot filtering implemented, so concurrent or pattern-based hits are classified and reported as bot traffic, whereas Splunk tracks all hits.
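
To put rough numbers on the factors above, here is a minimal Python sketch. It takes the 1:10 ratio and the 70% consent rate from this thread and works backwards to the share of server requests that would have to come from sources Adobe Analytics never counts. The function name, its parameters, and the idea of a single `blocker_rate` for tracking protection are assumptions made for illustration, not anything measured on the site.

```python
def implied_untracked_share(server_requests, analytics_views,
                            consent_rate, blocker_rate=0.0):
    """Share of server requests that must come from sources the analytics
    tool never counts (bots, crawlers, scripts, health checks) in order
    to explain the observed gap. All inputs are rough estimates."""
    # Fraction of real browser page loads that can still fire the analytics call.
    trackable = consent_rate * (1.0 - blocker_rate)
    # Browser page loads needed to produce the observed analytics page views.
    browser_requests = analytics_views / trackable
    return 1.0 - browser_requests / server_requests


# 1,000 Splunk requests per 100 page views, 70% consent (figures from the thread).
share = implied_untracked_share(server_requests=1000,
                                analytics_views=100,
                                consent_rate=0.70)
print(f"{share:.1%}")  # prints 85.7%
```

Under these assumptions, consent alone accounts for only a small part of the gap. Most of the 10x difference would have to come from non-browser traffic, which is consistent with the points about programmatic access and bot filtering.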

Jennifer_Dungan
Community Advisor and Adobe Champion
May 15, 2024

When you say:

 

Splunk html_ref requests for that page in server-side logging

 

Do you mean that you are taking your raw server logs for analysis?

 

This will result in massive differences, since server logs record every hit to the page, including hits from bots, backend pinging processes, and server-side health checks. Every possible hit is recorded on the server, even hits that don't result in a page view. Server logs are important for understanding the load on your servers, but they are not a good candidate for analyzing traffic.
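
As a rough illustration of how much of a server log can be non-visitor traffic, the Python sketch below filters a handful of made-up log records down to likely browser page loads. The record keys (`path`, `user_agent`), the health-check paths, and the user-agent patterns are all hypothetical and would need to be adapted to the real log format. Pattern matching on user agents is only a crude approximation of what a dedicated bot filter does.

```python
import re

# Illustrative patterns only; real bot and monitoring traffic varies by site.
BOT_UA = re.compile(r"bot|crawler|spider|curl|postman|healthchecker",
                    re.IGNORECASE)
HEALTH_PATHS = {"/health", "/ping"}  # hypothetical health-check paths


def is_probable_page_view(record):
    """Return True if a log record looks like a real browser page load.
    `record` is a dict with the hypothetical keys `path` and `user_agent`."""
    if record["path"] in HEALTH_PATHS:
        return False
    user_agent = record.get("user_agent") or ""
    # A missing user agent or a known automation pattern is treated as non-human.
    if not user_agent or BOT_UA.search(user_agent):
        return False
    return True


# Made-up sample records, not real log data.
records = [
    {"path": "/products.html", "user_agent": "Mozilla/5.0 (Windows NT 10.0) Firefox/125.0"},
    {"path": "/products.html", "user_agent": "Googlebot/2.1"},
    {"path": "/products.html", "user_agent": "PostmanRuntime/7.37.0"},
    {"path": "/health", "user_agent": "ELB-HealthChecker/2.0"},
]
likely_views = [r for r in records if is_probable_page_view(r)]
print(len(likely_views), "of", len(records))  # prints 1 of 4
```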

abhinavpuri
Community Advisor and Adobe Champion
May 15, 2024

Thanks @jennifer_dungan, I agree. I know it is impossible to calibrate the two platforms, but I was looking for key differences, documentation, or reference material that can help explain the discrepancy.

Jennifer_Dungan
Community Advisor and Adobe Champion
May 15, 2024

Right, I get it, but I am not sure that server logs are the best comparison. We use the free version of GA as a comparison. It's not one to one, but it's usually within 10%, and we can monitor trends: are they both going up? Both going down? Both staying constant?
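
The trend check described here can be sketched in a few lines of Python. The daily counts below are invented for illustration, and the function name and the 10% default tolerance (taken from the "within 10%" remark) are assumptions rather than a recommended threshold.

```python
def sign(x):
    """Return 1, 0 or -1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def compare_trends(a, b, tolerance=0.10):
    """Compare two equal-length lists of daily counts from two tools.
    Returns (all daily gaps within tolerance, day-over-day directions agree)."""
    # Relative gap per day, measured against the first series.
    gaps = [abs(x - y) / x for x, y in zip(a, b)]
    within = all(g <= tolerance for g in gaps)
    # Do both series go up, down, or stay flat on the same days?
    moves_agree = all(
        sign(a[i] - a[i - 1]) == sign(b[i] - b[i - 1])
        for i in range(1, len(a))
    )
    return within, moves_agree


adobe = [1000, 1100, 1050, 1200]  # made-up daily page views
other = [950, 1040, 1010, 1130]   # made-up counts from a second analytics tool
print(compare_trends(adobe, other))  # prints (True, True)
```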