I am saving a unique identifier string in a dimension. The string is 64 characters long.
I will be creating a calculated metric on this dimension to get its count of unique values, and I will be using that calculated metric in Adobe Analysis Workspace.
This dimension will have up to 350,000 unique values. I am concerned about the performance of the COUNT function on this dimension, especially given the long string size and the large number of unique values.
Will adding and using this metric have any significant performance impact on the Analysis Workspace dashboard?
The COUNT function should still work for you without any performance degradation. However, while I've used it with thousands of values before, I haven't used it with 30,000+ values, so I can't say for sure whether Analysis Workspace will return the expected result.
If you don't need the exact numbers, you can also use the APPROXIMATE COUNT DISTINCT function. https://experienceleague.adobe.com/docs/analytics/components/calculated-metrics/calcmetrics-referenc... That won't return the exact numbers all of the time, but if you're running a report to observe trends, then that function should work well for you.
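To give a feel for why an approximate distinct count stays cheap even with long strings, here is a minimal Python sketch. It is not Adobe's implementation, and I'm not claiming anything about how Analysis Workspace computes the function internally. It uses a K-minimum-values sketch, a simple textbook technique chosen only for illustration, and the names `hash_to_unit` and `approx_count_distinct` are made up for this example. The point is that each 64-character value is hashed once and only a fixed number of hashes is kept, so memory does not grow with the number of unique values.

```python
import hashlib
import heapq


def hash_to_unit(value: str) -> float:
    """Map a string to a pseudo-uniform number in [0, 1) via SHA-256."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def approx_count_distinct(values, k=4096):
    """Estimate the number of distinct values with a K-minimum-values sketch.

    Only the k smallest distinct hashes are kept, so memory is bounded by k
    no matter how many unique values (or how long the strings) there are.
    """
    heap = []        # max-heap of the k smallest hashes (stored negated)
    members = set()  # hashes currently in the heap, to skip repeats
    for value in values:
        h = hash_to_unit(value)
        if h in members:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            members.add(h)
        elif h < -heap[0]:
            # New hash is smaller than the current k-th smallest: swap it in.
            evicted = -heapq.heappushpop(heap, -h)
            members.discard(evicted)
            members.add(h)
    if len(heap) < k:
        # Fewer than k distinct hashes were seen, so the count is exact.
        return len(heap)
    kth_smallest = -heap[0]
    return int((k - 1) / kth_smallest)


if __name__ == "__main__":
    # Hypothetical data: 64-character hex identifiers, some of them repeated.
    ids = [hashlib.sha256(str(i).encode()).hexdigest() for i in range(20000)]
    ids += ids[:5000]
    print("exact:", len(set(ids)))
    print("approx:", approx_count_distinct(ids))
```

With `k=4096` the estimate in this sketch typically lands within a few percent of the exact count, which matches the "fine for trends, not for exact numbers" trade-off described above.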
I would also suggest the "Approximate Count Distinct" calculated metric function. In fact, I don't think "Count" would work here, since Count is used for metrics, while Approximate Count Distinct is used for dimensions.
I use Approximate Count Distinct on a dimension that regularly has 160K rows per month without issue.