SOLVED

Performance impact of calculating Unique Count on a long string based Dimension

I am saving a unique identifier string in a dimension. The string is 64 characters long.

I will be creating a Calculated Metric on this dimension to get its unique count, and I will be using this Calculated Metric in Adobe Analysis Workspace.

This dimension will have up to 350,000 unique values. I am concerned about the performance of the Count function on this dimension, especially because of the long string size and the large number of unique values.

Is there going to be any significant performance impact on the Analysis Dashboard if this metric is added and used?

1 Accepted Solution

Correct answer by
Community Advisor

The COUNT function should still work for you without any performance degradation. However, while I've used it with thousands of values before, I haven't used it with 30,000+ values, so I can't say for sure whether Analysis Workspace will return the expected result.

If you don't need the exact numbers, you can also use the APPROXIMATE COUNT DISTINCT function: https://experienceleague.adobe.com/docs/analytics/components/calculated-metrics/calcmetrics-referenc...

That function won't return the exact numbers all of the time, but if you're running a report to observe trends, it should work well for you.
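To make the exact-versus-approximate trade-off a bit more concrete, here is a minimal Python sketch that compares an exact distinct count with a HyperLogLog-style estimate on 64-character strings. It is only an illustration: HyperLogLog is a common technique for this kind of estimate, but whether it matches what Analysis Workspace does internally is an assumption here, and the helper names (`make_ids`, `exact_distinct`, `approx_distinct`) and the register setting `p=14` are hypothetical choices, not anything from Adobe.

```python
import hashlib
import math


def make_ids(n):
    """Build n distinct 64-character hex strings (stand-ins for the real identifiers)."""
    return [hashlib.sha256(str(i).encode("utf-8")).hexdigest() for i in range(n)]


def exact_distinct(values):
    """Exact distinct count: every unique string is kept in memory."""
    return len(set(values))


def approx_distinct(values, p=14):
    """HyperLogLog-style estimate using 2**p small registers instead of the strings."""
    m = 1 << p
    registers = [0] * m
    for v in values:
        # Take a 64-bit hash of the value; the string length no longer matters after this.
        h = int.from_bytes(hashlib.sha256(v.encode("utf-8")).digest()[:8], "big")
        idx = h >> (64 - p)                      # first p bits choose a register
        rest = h & ((1 << (64 - p)) - 1)         # remaining 64 - p bits
        rank = (64 - p) - rest.bit_length() + 1  # position of the leftmost 1-bit
        if rank > registers[idx]:
            registers[idx] = rank
    alpha = 0.7213 / (1 + 1.079 / m)             # standard bias constant for m >= 128
    estimate = alpha * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if estimate <= 2.5 * m and zeros:
        # Small-range correction (linear counting) when many registers are still empty.
        estimate = m * math.log(m / zeros)
    return round(estimate)


if __name__ == "__main__":
    ids = make_ids(350_000)
    data = ids + ids[:50_000]  # add repeats so "distinct" differs from the row count
    print("rows:", len(data))
    print("exact distinct:", exact_distinct(data))
    print("approximate distinct:", approx_distinct(data))
```

The exact version has to hold all 350,000 × 64 = 22,400,000 characters' worth of unique strings, while the estimator in this sketch only keeps 16,384 small integers regardless of how long the strings are. That is the general reason an approximate count tends to be the safer choice for trend reports on high-cardinality dimensions.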

2 Replies

Community Advisor and Adobe Champion

I would also suggest the "Approximate Count Distinct" calculated metric. In fact, I don't think "Count" would work, since Count is used for Metrics, while Approximate Count Distinct is used for Dimensions:

[Image: Jennifer_Dungan_1-1691100349661.png]

I use Approximate Count Distinct on a Dimension that regularly has 160K rows per month without issue.