SOLVED

[High Discrepancy] Traffic Metrics (Pageviews) between Adobe Analytics and Splunk (html_ref) Requests

Level 5

Hi Everyone -

We have noticed a large discrepancy between Adobe Analytics page views for a specific page (report filtered to that page) and Splunk html_ref requests for the same page in our server-side logs.

The ratio is around 1:10, meaning that for every 100 page views in Adobe Analytics we see 1,000 HTML resource requests in Splunk. Please note: Splunk logs all resources, but we've filtered to ".html" document requests only.

 

Has anyone else encountered this scenario? What would be a possible solution?


1 Accepted Solution

Correct answer by
Community Advisor

@abhinavpuri Did you check the real-time logs in Splunk while accessing that specific page? See how many HTML requests are generated. I don't know much about Splunk, but I've heard that it generates a lot of logs for each and every thing. Looking at the logs may help, although there will be a lot of them, containing other information about the page. Starting the troubleshooting process by watching the logs in real time might be a good first step.

5 Replies

Level 5

Hi @Krishna_Musku - I tried this approach as well, to rule out duplicate calls in the Splunk logs.
I filtered down to the specific HTML resource and my IP address and checked in real-time monitoring.
For every page-view call in Adobe Analytics, we can see a single hit recorded in Splunk.

 

My assumption is that the difference between the Splunk logs and Adobe Analytics could be due to the following:

- We have cookie consent implemented on the website, with an average acceptance rate of 70% for analytics cookies, so this alone would introduce a difference of about 30% in counts.

- Some modern browsers, such as Firefox with its Enhanced Tracking Protection (ETP) feature, block marketing/analytics tracking calls on the client side.

- Adobe Analytics cannot track programmatic access to resources, for example from Postman or from crawlers on the website.

- Adobe Analytics has IAB-based filtering implemented, so concurrent or pattern-based hits are classified and reported as bot traffic, whereas Splunk tracks all hits.
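
As a rough sanity check on the factors above, the arithmetic can be sketched like this. The 100 and 1,000 counts and the 70% acceptance rate are the figures from this thread; the function name is only illustrative, and the sketch assumes consent is the only thing suppressing Adobe Analytics hits for human visitors, which is a simplification.

```python
def consent_adjusted_gap(aa_pageviews, server_requests, consent_rate):
    """Estimate how much of the server-side count consent alone can explain.

    Assumption (not from any vendor documentation): every human page load
    produces one server request, and only `consent_rate` of them are tracked.
    """
    # If only consent_rate of human page loads are tracked, the implied number
    # of human page loads is the tracked count divided by that rate.
    implied_human_loads = aa_pageviews / consent_rate
    # Whatever remains has to come from other sources (tracking blockers,
    # bots, crawlers, programmatic clients, health checks, and so on).
    unexplained = server_requests - implied_human_loads
    return implied_human_loads, unexplained, unexplained / server_requests


loads, unexplained, share = consent_adjusted_gap(100, 1000, 0.70)
print(f"implied human page loads: {loads:.0f}")
print(f"requests not explained by consent: {unexplained:.0f} ({share:.0%})")
```

Under these assumptions, consent accounts for only about 143 of the 1,000 requests, so most of the 1:10 gap would have to come from the other factors in the list.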

Community Advisor

When you say:

 

Splunk html_ref requests for that page in server-side logging

 

Do you mean that you are using your raw server logs for analysis?

 

This will result in massive differences, since server logs record every hit to the page, including hits from bots, backend pinging processes, and so on. Every possible hit is recorded on the server, even hits that don't result in a page view, or server-side health checks. Server logs are important for understanding the load on your servers, but they are not a good candidate for analyzing traffic.
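
To get a feel for how much of the server-side count is automated, one option is to split the requests for the page by user agent before comparing. Below is a minimal sketch, assuming the log records have already been exported as dictionaries with hypothetical `path` and `user_agent` fields; the pattern is a made-up short list, and real bot lists (such as the IAB one) are far longer.

```python
import re

# Hypothetical, deliberately short pattern for illustration only.
BOT_PATTERN = re.compile(
    r"bot|crawler|spider|curl|postman|healthcheck", re.IGNORECASE
)


def split_requests(records, page_path):
    """Split requests for page_path into likely-human and likely-automated."""
    human, automated = [], []
    for rec in records:
        if rec["path"] != page_path:
            continue  # ignore requests for other pages
        if BOT_PATTERN.search(rec.get("user_agent", "")):
            automated.append(rec)
        else:
            human.append(rec)
    return human, automated


# Made-up sample records, not real log data.
sample = [
    {"path": "/home.html", "user_agent": "Mozilla/5.0 (Windows NT 10.0) Firefox/120.0"},
    {"path": "/home.html", "user_agent": "Googlebot/2.1 (+http://www.google.com/bot.html)"},
    {"path": "/home.html", "user_agent": "PostmanRuntime/7.36.0"},
    {"path": "/home.html", "user_agent": "ELB-HealthChecker/2.0"},
    {"path": "/other.html", "user_agent": "Mozilla/5.0 (Macintosh) Safari/605.1.15"},
]
human, automated = split_requests(sample, "/home.html")
print(len(human), "likely human;", len(automated), "likely automated")
```

User-agent matching only catches clients that identify themselves, so the "likely human" bucket is still an upper bound.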

Level 5

Thanks @Jennifer_Dungan, I agree. I know it is impossible to calibrate both platforms exactly, but I was looking for key differences, documentation, or reference material that can help establish where the differences come from.

Community Advisor

Right, I get it, but I am not sure that server logs are the best comparison. We use the free version of GA as a comparison. It's not one-to-one, but it's usually within 10%, and we can monitor trends: are they both going up? Both going down? Both staying constant?
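
The trend check can be sketched in a few lines. This is only an illustration: the daily counts are made up, and the function names are hypothetical rather than part of any analytics tool.

```python
def direction(prev, curr):
    """Return +1, -1, or 0 for a rise, fall, or no change."""
    return (curr > prev) - (curr < prev)


def trend_agreement(series_a, series_b):
    """Fraction of day-over-day moves where both series move the same way.

    Both series need the same length and at least two data points.
    """
    moves_a = [direction(p, c) for p, c in zip(series_a, series_a[1:])]
    moves_b = [direction(p, c) for p, c in zip(series_b, series_b[1:])]
    matches = sum(a == b for a, b in zip(moves_a, moves_b))
    return matches / len(moves_a)


# Made-up daily page-view counts from two tools, purely for illustration.
adobe = [100, 120, 110, 130, 125]
other = [95, 118, 105, 128, 131]
print(f"same direction on {trend_agreement(adobe, other):.0%} of days")
```

Comparing directions rather than absolute counts sidesteps the fact that two tools will rarely report the same totals.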