Confidence Intervals / General Statistical Significance



Some of the most common questions we have when we look at reports of the form X per Y (usually a calculated metric) are about the statistical properties of that number. It would be extremely helpful to have, for example, a standard error for it.


For example, say we have an eVar for an advertising campaign and we want to compare revenue per visit for different campaign codes. SiteCatalyst could tell us that campaignA had revenue of $1,000 with 1,000 visits ($1.00/visit) and campaignB had revenue of $500 with 400 visits ($1.25/visit). But we get no (direct) information to help decide whether the $0.25 difference is statistically significant. For instance, if the two campaigns reached most of their respective revenue totals in just a single sale each, we could assume the difference is NOT significant.
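To make the point concrete, here is a minimal Python sketch of why the totals alone are not enough. The per-visit revenue lists are invented for illustration (they are not SiteCatalyst output): both describe 1,000 visits totaling $1,000, so both report $1.00/visit, yet the standard error of that figure differs by roughly a factor of ten.

```python
import math
import statistics


def mean_and_standard_error(per_visit_revenue):
    """Return (mean, standard error of the mean) for a list of per-visit revenue."""
    n = len(per_visit_revenue)
    mean = statistics.mean(per_visit_revenue)
    # Sample standard deviation divided by sqrt(n) gives the standard error.
    standard_error = statistics.stdev(per_visit_revenue) / math.sqrt(n)
    return mean, standard_error


# Hypothetical data: the same totals reached in two very different ways.
one_big_sale = [1000.0] + [0.0] * 999          # a single $1,000 sale
many_small_sales = [10.0] * 100 + [0.0] * 900  # one hundred $10 sales

mean_big, se_big = mean_and_standard_error(one_big_sale)
mean_small, se_small = mean_and_standard_error(many_small_sales)

print(f"one big sale:     ${mean_big:.2f}/visit, standard error {se_big:.3f}")
print(f"many small sales: ${mean_small:.2f}/visit, standard error {se_small:.3f}")
```

In the single-sale case the standard error is about as large as the metric itself, so a $0.25 difference between campaigns would tell us very little.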


Possible solutions:


1. Deeply integrate statistical reporting into SiteCatalyst. This is probably the most user-friendly option but is clearly a major change to the product. (And I don't even know what it really means.)


2. Store and report sum of squares data for events along with existing sums. This would be sufficient information to answer all the statistical significance questions I can think of (given enough data and appropriate independence assumptions). One downside is that users would need to know how to interpret this information properly. Another difficulty is that the sum of squares value depends on how the report is generated and on the unit the data is aggregated over. I would assume sum of squares across visits, but others might want sum of squares across unique visitors.
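As a rough sketch of how option 2 could be used, the Python below derives a standard error and a two-sample test statistic from nothing more than a visit count, a revenue sum, and a revenue sum of squares. The `MetricTotals` class and the sum-of-squares values are hypothetical (SiteCatalyst does not report them today); the visit counts and revenue totals are the ones from the campaign example. It assumes visits are independent, that the sum of squares is taken across visits, and that there are enough visits for a normal approximation to be reasonable.

```python
import math
from dataclasses import dataclass


@dataclass
class MetricTotals:
    """Hypothetical per-campaign totals: visit count, revenue sum, revenue sum of squares."""
    visits: int
    revenue_sum: float
    revenue_sum_sq: float

    def mean(self):
        return self.revenue_sum / self.visits

    def variance(self):
        # Unbiased sample variance recovered from the sum and the sum of squares.
        n = self.visits
        return (self.revenue_sum_sq - self.revenue_sum ** 2 / n) / (n - 1)

    def standard_error(self):
        return math.sqrt(self.variance() / self.visits)


def difference_z_score(a, b):
    """Welch-style statistic for the difference in means (unequal variances)."""
    se_diff = math.sqrt(a.standard_error() ** 2 + b.standard_error() ** 2)
    return (b.mean() - a.mean()) / se_diff


def two_sided_p_value(z):
    """Two-sided p-value under a standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))


# Visits and revenue come from the example; the sums of squares are made up.
campaign_a = MetricTotals(visits=1000, revenue_sum=1000.0, revenue_sum_sq=50_000.0)
campaign_b = MetricTotals(visits=400, revenue_sum=500.0, revenue_sum_sq=30_000.0)

z = difference_z_score(campaign_a, campaign_b)
p = two_sided_p_value(z)
print(f"A: ${campaign_a.mean():.2f}/visit, standard error {campaign_a.standard_error():.3f}")
print(f"B: ${campaign_b.mean():.2f}/visit, standard error {campaign_b.standard_error():.3f}")
print(f"z = {z:.2f}, two-sided p = {p:.2f}")
```

With these invented sums of squares the $0.25 difference is well within the noise. Different sums of squares for the same totals could give the opposite answer, which is exactly why the extra number matters. Switching the unit from visits to unique visitors would change both the count and the sum of squares, so the two should always be reported for the same unit.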


Solution #2 would be an important and worthy addition to SiteCatalyst, but I recognize the difficulty of implementing it.