Same segment, different numbers in Warehouse and Workspace...

MattBlake
Level 1

Hi, I have a Data Warehouse headscratcher... Hoping someone might be able to help me explain something...

 

The situation is this:

 

We have two props, one for 'full URL' (containing domain name and parameter strings) and one for 'path' (URL excluding domain name and parameter strings).

 

We have a segment (let's call it Segment X) which is based around an assortment of 'full URL' starts with / contains values.

However, if I then create a Workspace table broken down by day, and compare it to a Data Warehouse table (just with visits, all visits, granularity day, therefore minimal scope for duplication), the numbers come back with quite substantial differences. Same metrics, same segment, same dimensions... different numbers.

 

If I then create a duplicate segment (let's call it Segment Y) which is based around the same pages, but uses 'path' starts with / contains values instead of 'full URL', then the numbers in Workspace and Data Warehouse match up.

 

Does anyone have any ideas about ways in which Workspace and Data Warehouse process data that means the same segment could work in a slightly different fashion? So essentially the same report gives different figures? I was thinking about stuff like character limits having an impact (as URLs are longer than paths)?

1 Reply
VaniBhemarasetty
Employee

@MattBlake Yes, your understanding is right. In Workspace there is a limit on the size of values. Since you are referring to a prop here, the maximum size is 100 bytes. If a value is longer than 100 bytes, it is truncated to 100 bytes and the rest is ignored. So if the URL has query string parameters and exceeds 100 bytes, everything after the first 100 bytes is dropped.

However, that is not the case in Data Warehouse: there you can report on the whole value, without the size limit.

 

Now, when you create a segment on "Full URL" and your condition says contains "query string parameter":

 

There is a good chance the segment population will be smaller, because this "query string parameter" is stripped during processing and is no longer there in Analytics.

However, when the same segment is used in Data Warehouse, it will have a larger population, because the "query string parameter" is still there.
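
To make the effect concrete, here is a rough Python sketch of the behaviour described above. It is only an illustration, not how Analytics is actually implemented: the `truncate_prop` and `contains` helpers, the example URL, and the exact truncation rule (cut at 100 UTF-8 bytes) are all assumptions made for the example.

```python
PROP_MAX_BYTES = 100  # prop size limit mentioned in this reply


def truncate_prop(value: str, limit: int = PROP_MAX_BYTES) -> str:
    """Hypothetical helper: keep at most `limit` UTF-8 bytes of a value."""
    # errors="ignore" drops a multi-byte character that was cut in half.
    return value.encode("utf-8")[:limit].decode("utf-8", errors="ignore")


def contains(value: str, needle: str) -> bool:
    """Hypothetical stand-in for a 'contains' segment condition."""
    return needle in value


# Made-up hit: the full URL is longer than 100 bytes, the path is not.
full_url = (
    "https://www.example.com/products/category/subcategory/"
    "item-name-that-is-long/index.html?campaign=spring_sale"
)
path = "/products/category/subcategory/item-name-that-is-long/index.html"

stored_full_url = truncate_prop(full_url)  # what the truncated prop would hold
stored_path = truncate_prop(path)          # short enough to survive unchanged

# Segment X style condition: 'full URL' contains a query string parameter.
matches_on_full_value = contains(full_url, "campaign=spring_sale")
matches_on_truncated = contains(stored_full_url, "campaign=spring_sale")

# Segment Y style condition: 'path' starts with a value.
path_matches_full = path.startswith("/products/category")
path_matches_truncated = stored_path.startswith("/products/category")

print(matches_on_full_value, matches_on_truncated)
print(path_matches_full, path_matches_truncated)
```

In this sketch the 'full URL' condition matches the untruncated value but not the truncated one, while the 'path' condition gives the same answer either way, which is the pattern you are seeing between Segment X and Segment Y.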

 

Hope this explains it.