SOLVED

Reason for Data Variance in Adobe Analytics Workspace

Level 3

When pulling historical data in Adobe Analytics Workspace (specifically, monthly data for a past period such as 2023), I've noticed that the figures differ depending on whether I pulled the data last week, yesterday, or today. This happens even though I'm not pulling recent dates but a fully historical period. The segment, date range, and all other settings are identical, yet the numbers are inconsistent. The metrics are Visits and Entries.

The difference is usually minor, 1 or 2, but in some cases it is larger, even exceeding 10. Could you please clarify why these discrepancies occur?

 

I understand that data might be subject to modifications, but I would like to know what factors contribute to these data revisions. Additionally, could you clarify how far back these modifications can affect historical data?

 

I consulted with ChatGPT, and it suggested that the following factors could cause data variations:

  • Data latency and processing times
  • Bot filtering
  • VISTA rules
  • Anomaly detection
  • Late arriving hits

However, after reviewing the relevant Adobe official documentation, I couldn’t find any specific mention of these factors leading to changes in fully historical data.

1 Accepted Solution

Correct answer by
Level 1
The data variance you’re seeing in Adobe Analytics Workspace for historical periods is likely due to hash collisions, where unique data values map to the same identifier in Adobe's backend, causing slight discrepancies in repeated queries.
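To make the term concrete, here is a toy sketch of what a hash collision is: distinct values are mapped into a deliberately tiny identifier space, so some of them end up sharing an identifier and get counted together. The function names, the 8-bit space, and the use of SHA-256 are all assumptions for illustration; this is not Adobe's actual hashing scheme.

```python
import hashlib
from collections import Counter


def toy_hash(value: str, bits: int = 8) -> int:
    """Map a string into a tiny space of 2**bits identifiers (illustrative only)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % (2 ** bits)


def count_by_hash(values, bits: int = 8) -> Counter:
    """Count occurrences per hashed identifier instead of per original value."""
    counts = Counter()
    for v in values:
        counts[toy_hash(v, bits)] += 1
    return counts


# 1000 distinct, made-up dimension values.
values = [f"page-{i}" for i in range(1000)]

exact_unique = len(set(values))                      # distinct original values
hashed_unique = len(count_by_hash(values, bits=8))   # distinct identifiers after hashing

print(f"distinct values: {exact_unique}, distinct identifiers: {hashed_unique}")
```

With only 256 possible identifiers, at most 256 of the 1000 values can stay separate, so the rest are merged. A real system uses a far larger identifier space, which makes collisions rare rather than impossible.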

 

Other possible reasons:

  • Adobe might retroactively filter newly identified bot traffic, which affects historical data.
  • Delayed session data occasionally updates historical records.
  • Adobe may adjust historical data to correct anomalies.

 

I would suggest trying the following:

  • Run queries at the same time for consistency.
  • Use Data Warehouse or Report Builder.
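To track how much the numbers move between pulls, you could keep each export and compare them. Below is a minimal sketch, assuming each pull has been loaded into a dictionary keyed by month; the `diff_pulls` helper and the sample figures are hypothetical and do not come from any Adobe tool or real report.

```python
from typing import Dict, Tuple


def diff_pulls(earlier: Dict[str, int], later: Dict[str, int]) -> Dict[str, Tuple[int, int, int]]:
    """Return {period: (earlier, later, delta)} for periods whose values differ."""
    changes = {}
    for period in sorted(set(earlier) | set(later)):
        a = earlier.get(period, 0)  # a period missing from a pull is treated as 0
        b = later.get(period, 0)
        if a != b:
            changes[period] = (a, b, b - a)
    return changes


# Made-up example values, not real Adobe Analytics data.
pull_last_week = {"2023-01": 10450, "2023-02": 9980, "2023-03": 11020}
pull_today = {"2023-01": 10451, "2023-02": 9980, "2023-03": 11008}

for period, (a, b, delta) in diff_pulls(pull_last_week, pull_today).items():
    print(f"{period}: {a} -> {b} ({delta:+d})")
```

Logging these deltas over a few weeks shows whether the variance stays within a handful of visits or grows enough to be worth raising with Adobe.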

 

While hash collisions are likely the root cause of your data variance, it is still a good idea to monitor your data extraction process closely and to consult Adobe when major discrepancies occur.

3 Replies

Level 3

Are you looking at specific metrics here? And what did you use to extract the data (Data Warehouse, Report Builder)?

Level 3

I just used Workspace. Adobe Customer Service said it was due to hash collisions.
