Hi @AnishKo
TL;DR: you cannot combine or mentally “average” controls across activities; doing so is statistically invalid. Try increasing the control size (why not go with a 50/50 split per experience?) and your data should at least gain statistical significance.
A 5% control group is very small and statistically fragile, especially when audiences differ in size, behavior, or baseline conversion rate. Your confidence fluctuates because the control sample is too small for the variance to stabilize.
Each activity’s lift and confidence are valid only for that audience and that activity. With a 5% control, results are directionally indicative at best, not decision-grade. So the variation you’re seeing is expected: it’s a consequence of underpowered controls, not an analytics error.
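To see why the control share matters so much, here is a quick sketch of the standard error of the measured lift under a normal approximation. The traffic and conversion numbers are made up for illustration; plug in your own:

```python
from math import sqrt

def lift_se(total_visitors, control_share, base_rate):
    """Standard error of the difference in conversion rates between
    control and treatment (normal approximation, equal true rates)."""
    n_control = total_visitors * control_share
    n_treatment = total_visitors * (1 - control_share)
    var = base_rate * (1 - base_rate)  # binomial variance per visitor
    return sqrt(var / n_control + var / n_treatment)

# Hypothetical example: 20,000 visitors, 3% baseline conversion
for share in (0.05, 0.50):
    se = lift_se(20_000, share, 0.03)
    print(f"control share {share:.0%}: SE of lift = {se:.4f}, "
          f"95% CI half-width = {1.96 * se:.4f}")
```

With these numbers, the 5% control yields a confidence interval more than twice as wide as the 50/50 split, because the tiny control arm dominates the variance term. That width is the “fluctuating confidence” you are observing.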
Cheers from Switzerland!