Performance impact of calculating Unique Count on a long string based Dimension
August 2, 2023
Solved


I am saving a unique identifier string in a dimension. The string is 64 characters long.

I plan to create a Calculated Metric that returns the unique count of this dimension, and to use that metric in Adobe Analysis Workspace.

The dimension will have up to 350,000 unique values. I am concerned about the performance of the Count function on this dimension, especially given the long strings and the large number of unique values.

Will adding and using this metric have any significant performance impact on the Analysis Workspace dashboard?

yuhuisg
Community Advisor, Accepted solution
August 3, 2023

The COUNT function should still work for you without any performance degradation. However, while I've used it with thousands of values before, I haven't used it with 30,000+ values, so I can't say for sure whether Analysis Workspace will return the expected result.

If you don't need exact numbers, you can also use the APPROXIMATE COUNT DISTINCT function: https://experienceleague.adobe.com/docs/analytics/components/calculated-metrics/calcmetrics-reference/cm-adv-functions.html?lang=en#concept_000776E4FA66461EBA79910B7558D5D7 It won't return the exact count every time, but if you're running a report to observe trends, it should work well for you.
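
To give a feel for what "approximate" means here, below is a minimal Python sketch of a textbook HyperLogLog estimator compared with an exact distinct count over 350,000 identifiers of 64 characters each. As far as I know, Adobe's function reference describes APPROXIMATE COUNT DISTINCT as HyperLogLog-based, but this is not Adobe's implementation. The function name `hll_estimate`, the register parameter `p`, and the synthetic IDs are all assumptions made for illustration.

```python
import hashlib
import math


def hll_estimate(values, p=14):
    """Estimate the number of distinct strings with a basic HyperLogLog.

    Illustrative only: `p` (2**p registers) is an assumed setting,
    not something taken from Adobe Analytics.
    """
    m = 1 << p                       # number of registers
    registers = [0] * m
    for v in values:
        # Take a 64-bit hash from the first 8 bytes of SHA-256.
        h = int.from_bytes(hashlib.sha256(v.encode()).digest()[:8], "big")
        idx = h >> (64 - p)                  # first p bits pick a register
        rest = h & ((1 << (64 - p)) - 1)     # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in `rest`.
        rank = (64 - p) - rest.bit_length() + 1
        if rank > registers[idx]:
            registers[idx] = rank
    alpha = 0.7213 / (1 + 1.079 / m)         # bias constant for large m
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros:
        # Small-range correction (linear counting).
        return m * math.log(m / zeros)
    return raw


# Synthetic 64-character identifiers (hex SHA-256 digests), with some repeats.
ids = [hashlib.sha256(str(i).encode()).hexdigest() for i in range(350_000)]
hits = ids + ids[:50_000]

exact = len(set(hits))               # exact distinct count
approx = hll_estimate(hits)          # approximate distinct count
print(f"exact={exact}, approx={approx:.0f}, "
      f"relative error={abs(approx - exact) / exact:.2%}")
```

Note that the estimator keeps only a fixed array of small registers, regardless of how long the strings are or how many distinct values exist. That is the general reason an approximate distinct count tends to scale better than an exact one, at the cost of a small error.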

Jennifer_Dungan
Community Advisor and Adobe Champion
August 3, 2023

I would also suggest the "Approximate Count Distinct" function. In fact, I don't think "Count" would work here, since Count is used for metrics, while Approximate Count Distinct is used for dimensions.

I use Approximate Count Distinct on a dimension that regularly has 160K rows per month without issue.