Use Case: An Audience Lab test group is created with one base segment, "Segment-A", which is divided into two splits: "Target-90%" and "Control-10%".
Total segment size = 100
Expected result, based on the split:
Target Destination should get 90
Control Destination should get 10
Actual result:
Target Destination = 82
Control Destination = 18
Question: Why are we seeing such a discrepancy?
The following points explain how Audience Lab splits the numbers in the outbound files:
- The split is done by computing a hash of the user's ID (there is a precedence rule for the ID).
- The hash is then used to obtain the percent bucket into which the user falls.
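The two steps above can be illustrated with a minimal sketch. It is not Audience Lab's actual implementation: the hash algorithm (SHA-256 here), the 0-99 bucket range, and the names percent_bucket and assign_split are all assumptions chosen for illustration. It only shows why a hash-based split is deterministic per user but not exact in aggregate.

```python
import hashlib


def percent_bucket(user_id: str) -> int:
    """Map a user ID to a bucket in 0..99 (hypothetical scheme, SHA-256 assumed)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and reduce it to a percent bucket.
    return int.from_bytes(digest[:8], "big") % 100


def assign_split(user_id: str, control_percent: int = 10) -> str:
    """Buckets below control_percent go to Control, the rest to Target."""
    return "Control" if percent_bucket(user_id) < control_percent else "Target"


# 100 made-up user IDs, matching the segment size in the use case.
users = [f"user-{i}" for i in range(100)]
counts = {"Target": 0, "Control": 0}
for uid in users:
    counts[assign_split(uid)] += 1

# Every user lands in exactly one split, but the counts are rarely exactly 90/10.
print(counts)
```

Because each user is assigned independently from its own hash, nothing forces the totals to land on exactly 90 and 10; the same user always gets the same split, though.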
The hash function provides a good spread of users, but for small numbers it cannot guarantee an exact split. Tests in the development environment showed a difference of ±2% with 1,000 users in two equal buckets (50-50). The deviation gets worse when the bucket sizes differ by an order of magnitude and when the number of users is this low.
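To get a rough feel for the size of this effect, the sketch below assumes the hash behaves like an independent, uniform random draw per user. That is a modelling assumption, not a documented property of Audience Lab. Under it, the number of users in a bucket follows a binomial distribution, and its standard deviation is sqrt(n * p * (1 - p)). The function name split_std_dev is hypothetical.

```python
import math


def split_std_dev(n_users: int, fraction: float) -> float:
    """Standard deviation of a bucket's user count under a binomial model (assumption)."""
    return math.sqrt(n_users * fraction * (1.0 - fraction))


# (number of users, share of the bucket being examined)
scenarios = [(1000, 0.5), (100, 0.1), (100_000, 0.1)]
for n, p in scenarios:
    expected = n * p
    sd = split_std_dev(n, p)
    print(
        f"n={n}, share={p:.0%}: expected {expected:.0f} users, "
        f"std dev {sd:.1f} users ({sd / expected:.1%} of the bucket)"
    )
```

Under this model, 1,000 users split 50-50 gives a standard deviation of about 16 users (roughly 1.6% of all users), which is in line with the ±2% seen in development. With 100 users and a 10% bucket, the standard deviation is 3 users, or 30% of the expected bucket size, so the relative error is far larger.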
To conclude, the split will not be a 100% match with the input numbers, and there will always be some margin of error in the exported numbers.