Discrepancy Between Clickstream Raw Data and Report Builder Data

Level 2

Wired has a variance of -2% (-5% on weekends), and mobile has a variance of +2%.

Any idea why this would happen?

4 Replies

Level 4

@RohitNa1 Assuming the percentage difference you quoted is between the two reports, one thing to remember is that the raw data will include any bot-related traffic, which the reports may filter out.

Community Advisor and Adobe Champion

There are a few things you need to keep in mind when using data feeds/clickstream.

 

First, the data has to be loaded into your data lake, and there could be problems with the ingestion. For example, if you have an eVar capturing a search term and it contains special characters, that can affect how the rest of the row is processed (I've seen that in our data). So first check that your rows are being loaded properly.
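As a rough illustration of that first check, here is a minimal sketch that flags rows whose field count does not match the expected column count. It assumes the feed is tab-delimited and that you already know how many columns each row should have. The function name and the sample rows are hypothetical, not taken from any real feed.

```python
def find_malformed_rows(lines, expected_columns, delimiter="\t"):
    """Return (line_number, field_count) for rows with an unexpected field count."""
    problems = []
    for line_number, line in enumerate(lines, start=1):
        # Strip only the trailing newline so empty trailing fields still count.
        field_count = len(line.rstrip("\n").split(delimiter))
        if field_count != expected_columns:
            problems.append((line_number, field_count))
    return problems


# Hypothetical three-column sample: the second row has a stray tab inside
# the search term, which shifts every later column by one.
sample_rows = [
    "2024-01-01 00:00:01\tshoes\t1\n",
    "2024-01-01 00:00:02\tred\tshoes\t1\n",
    "2024-01-01 00:00:03\tboots\t1\n",
]
malformed = find_malformed_rows(sample_rows, expected_columns=3)
print(malformed)
```

A row that shows up here is worth opening by hand, since a shifted column can silently put values into the wrong field rather than fail the load.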

 

Second, data feeds don't include classifications or any settings applied to a virtual report suite (VRS). If the report suite you use in Report Builder is a VRS, you need to replicate its segments in the clickstream data before trying to match the numbers. And if you rely on any classifications in Workspace, you need to rebuild those classifications too.

 

Third, your data may include hits from other sources (such as data upload APIs, data sources, etc.). You can limit those with conditions such as hit_source = "1" and exclude_hit = "0".
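A minimal sketch of those two conditions applied to tab-delimited rows. It assumes the column names have already been attached to the rows (the sample below simply includes a header line), and the sample values and the `pagename` column are made up for illustration.

```python
import csv
import io


def keep_reportable_hits(rows):
    """Keep rows matching the two conditions from the reply above."""
    return [
        row for row in rows
        if row["hit_source"] == "1" and row["exclude_hit"] == "0"
    ]


# Hypothetical sample with a header line added for readability.
sample = (
    "hit_source\texclude_hit\tpagename\n"
    "1\t0\thome\n"
    "5\t0\tuploaded\n"
    "1\t4\texcluded\n"
    "1\t0\tcart\n"
)
reader = csv.DictReader(io.StringIO(sample), delimiter="\t")
kept = keep_reportable_hits(reader)
print([row["pagename"] for row in kept])
```

Values are compared as strings because that is how they arrive from a delimited file; if your lake casts them to integers, compare against 1 and 0 instead.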

 

 

Once you've checked those things, see if your data is still not lining up. Keep in mind that web analytics data is never going to be perfect. For ours, we have a daily variance between 0.01% and 0.6% from clickstream to workspace/report builder. This is a really small amount compared to some of the things I've seen in the past. If you can get it within a 1% variance, that's probably as close as you're going to get. 
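To make the variance figures concrete, here is a small sketch of the calculation. The sign convention (clickstream minus report, relative to the report) is an assumption, so swap it if you measure the other way around. The counts are invented, and the 1% threshold is the rule of thumb from the paragraph above.

```python
def percent_variance(clickstream_count, report_count):
    """Percent difference of clickstream relative to the report figure."""
    if report_count == 0:
        raise ValueError("report_count must be non-zero")
    return 100.0 * (clickstream_count - report_count) / report_count


# Hypothetical daily totals.
daily_variance = percent_variance(100_600, 100_000)
within_one_percent = abs(daily_variance) <= 1.0
print(daily_variance, within_one_percent)
```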

 

If the variance is still too large, pick a single day (or even a single hour, depending on your traffic volume), compare the data for that period, and see if you can find a pattern in where the numbers diverge. That gives you something concrete to investigate further.
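One way to narrow it down is to compare counts per hour and sort by the size of the gap. This is only a sketch: the bucket labels, the counts, and the function name are hypothetical, and it assumes you have already aggregated both sources to the same buckets.

```python
def variance_by_bucket(clickstream_counts, report_counts):
    """Return (bucket, raw, reported, pct) tuples, largest gap first."""
    results = []
    for bucket in sorted(set(clickstream_counts) | set(report_counts)):
        raw = clickstream_counts.get(bucket, 0)
        reported = report_counts.get(bucket, 0)
        # pct is None when the report has no data for the bucket.
        pct = None if reported == 0 else 100.0 * (raw - reported) / reported
        results.append((bucket, raw, reported, pct))
    # Buckets missing from the report sort first so they are not overlooked.
    results.sort(
        key=lambda r: float("inf") if r[3] is None else abs(r[3]),
        reverse=True,
    )
    return results


# Hypothetical hourly hit counts for one day.
clickstream_hourly = {"00": 1000, "01": 950, "02": 1010}
report_hourly = {"00": 1000, "01": 1000, "02": 1000}
ranked = variance_by_bucket(clickstream_hourly, report_hourly)
for bucket, raw, reported, pct in ranked:
    print(bucket, raw, reported, pct)
```

The same function works for days, device types, or any other dimension, as long as both dictionaries use the same keys.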

Level 2

Thanks, I checked all the points and everything looks fine.
When I looked at device type, the variance is +2% for mobile and -2% for desktop.

No other patterns so far; I will keep exploring.

Community Advisor and Adobe Champion

Have you checked that the device type identifiers are being used correctly in the data feeds? If one device type were simply off by 2%, I would say there probably isn't anything systemic there. But since one device type is up by the same amount the other is down, I would double-check how you're grouping the device types.
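A quick way to test that suspicion is to check whether the per-device differences cancel out in the total. The sketch below uses invented counts, a hypothetical function name, and an arbitrary 0.5% tolerance. If the net difference is close to zero while the individual groups are clearly off, hits are probably landing in the wrong group rather than going missing.

```python
def offsetting_differences(clickstream_counts, report_counts, tolerance=0.005):
    """Return (diffs, net, looks_offsetting) for matching device groups."""
    groups = set(clickstream_counts) | set(report_counts)
    diffs = {
        g: clickstream_counts.get(g, 0) - report_counts.get(g, 0)
        for g in groups
    }
    net = sum(diffs.values())
    gross = sum(abs(d) for d in diffs.values())
    total = sum(report_counts.values())
    if total == 0:
        return diffs, net, False
    # Offsetting: groups disagree noticeably, yet the totals nearly match.
    looks_offsetting = abs(net) / total <= tolerance and gross / total > tolerance
    return diffs, net, looks_offsetting


# Hypothetical counts: mobile is +2% and desktop is -2% against the report.
clickstream_devices = {"mobile": 51_000, "desktop": 49_000}
report_devices = {"mobile": 50_000, "desktop": 50_000}
diffs, net, looks_offsetting = offsetting_differences(
    clickstream_devices, report_devices
)
print(diffs, net, looks_offsetting)
```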