Suppose I have the following breakdown imported from Data Warehouse:
| Dimension 1 | Dimension 2 | Recipe Id (Target) | Page Views | eVar Instances |
|---|---|---|---|---|
| A | D | B | XXXX | XXXX |
| A | E | B | YYYY | YYYY |
| A | F | G | ZZZZ | ZZZZ |
Then I decide to remove Dimension 2 because it's not relevant to what I want to calculate, so I drop the column and aggregate the rows (using a tool like Excel or Python), which gives me the following tables.
Before aggregation:
| Dimension 1 | Recipe Id (Target) | Page Views | eVar Instances |
|---|---|---|---|
| A | B | XXXX | XXXX |
| A | B | YYYY | YYYY |
| A | G | ZZZZ | ZZZZ |
After aggregation:
| Dimension 1 | Recipe Id (Target) | Page Views | eVar Instances |
|---|---|---|---|
| A | B | XXXX + YYYY | XXXX + YYYY |
| A | G | ZZZZ | ZZZZ |
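To make the operation concrete, here is a minimal sketch of the aggregation in plain Python. The numbers are made-up placeholders standing in for XXXX, YYYY and ZZZZ, and `rows` and `drop_dimension_2` are names I invented for illustration, not anything from Data Warehouse.

```python
from collections import defaultdict

# Placeholder rows: (dimension_1, dimension_2, recipe_id, page_views, evar_instances).
# The numbers are invented and only stand in for XXXX, YYYY and ZZZZ.
rows = [
    ("A", "D", "B", 100, 40),
    ("A", "E", "B", 250, 90),
    ("A", "F", "G", 75, 30),
]


def drop_dimension_2(rows):
    """Drop Dimension 2 and sum the metrics of rows that share the remaining key."""
    totals = defaultdict(lambda: [0, 0])
    for dim1, _dim2, recipe, page_views, instances in rows:
        totals[(dim1, recipe)][0] += page_views
        totals[(dim1, recipe)][1] += instances
    return {key: tuple(values) for key, values in totals.items()}


aggregated = drop_dimension_2(rows)
for (dim1, recipe), (page_views, instances) in aggregated.items():
    print(dim1, recipe, page_views, instances)
```

With these placeholder values the two rows for (A, B) collapse into a single row holding the sums, and the (A, G) row is unchanged.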
Is this a valid operation? My intuition tells me that it is valid for Page Views and Instances, since the rows are mutually exclusive. On the other hand, if I had a Visits metric it might be invalid, since the same visit may be counted in both XXXX and YYYY.
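The Visits concern can be illustrated with a toy example. This is only a sketch under my own assumptions: the hit-level rows and visit ids below are invented, and `visits_by` is a hypothetical helper that counts distinct visit ids per group.

```python
# Hypothetical hit-level data: (visit_id, dimension_1, dimension_2, recipe_id).
hits = [
    ("v1", "A", "D", "B"),
    ("v1", "A", "E", "B"),  # same visit, different Dimension 2 value
    ("v2", "A", "E", "B"),
    ("v3", "A", "F", "G"),
]


def visits_by(hits, key):
    """Count distinct visit ids per group, where key maps a hit to its group."""
    groups = {}
    for hit in hits:
        groups.setdefault(key(hit), set()).add(hit[0])
    return {group: len(visit_ids) for group, visit_ids in groups.items()}


# Visits as the three-dimension breakdown would report them.
detailed = visits_by(hits, lambda h: (h[1], h[2], h[3]))
# Visits counted directly at the coarser grain, without Dimension 2.
direct = visits_by(hits, lambda h: (h[1], h[3]))

# Summing the detailed rows after dropping Dimension 2.
summed = {}
for (dim1, _dim2, recipe), count in detailed.items():
    summed[(dim1, recipe)] = summed.get((dim1, recipe), 0) + count

print(summed[("A", "B")], direct[("A", "B")])
```

Here visit `v1` appears under both D and E, so summing the detailed rows gives 3 visits for (A, B) while only 2 distinct visits exist.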
What about Recipe Id (Target) or any other Target dimension? A hit may belong to many Target activities (A/B tests, for example), but Target breakdowns report each one individually. So if I want all Dimension 1 Page Views for element A, it may be wrong to sum XXXX + YYYY + ZZZZ, because the same Page Views of A could belong to both B and G, which would inflate the Page Views total.
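A toy sketch of the inflation I am worried about, assuming a single hit really can be attributed to more than one recipe. The `page_views` records and their `recipes` sets are invented for illustration and are not an actual Data Warehouse export format.

```python
# Hypothetical page-view hits for Dimension 1 = "A", each with the set of
# recipe ids it is attributed to (assumption: a hit can be in several activities).
page_views = [
    {"hit_id": 1, "recipes": {"B", "G"}},
    {"hit_id": 2, "recipes": {"B", "G"}},
    {"hit_id": 3, "recipes": {"B"}},
]

# Breakdown by recipe: every hit is counted once per recipe it belongs to.
per_recipe = {}
for hit in page_views:
    for recipe in hit["recipes"]:
        per_recipe[recipe] = per_recipe.get(recipe, 0) + 1

summed_over_recipes = sum(per_recipe.values())  # what summing the rows gives
actual_page_views = len(page_views)             # distinct page-view hits

print(summed_over_recipes, actual_page_views)
```

In this example the recipe rows add up to 5 Page Views even though only 3 hits occurred, because hits 1 and 2 are counted under both B and G.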
Thanks for any help. I'm still learning these concepts.