SOLVED

Hash collision reporting? We increased the unique value limit for Tracking Code to 1 million (from 500,000). How can I know if a hash collision happens?

Avatar

Level 2

Looking for Hash collision reporting:

We recently increased the unique value limit for Tracking Code to 1 million (from 500,000). Is there a way to know whether a hash collision happens during processing for Adobe Workspace? Thinking out loud: would comparing a Data Warehouse report to a Workspace report work? Any other ideas?

1 Accepted Solution

Avatar

Correct answer by
Community Advisor

Hey @ddierking 

 

Great scenario!

 

I believe there's no feature for finding out whether a hash collision occurred while the data was being processed for reporting in Workspace.

 

Your idea of comparing Workspace and Data Warehouse reports sounds good to me.

 

However, we could ask the Adobe product team to share stats on what percentage of collisions to expect with 500K and 1M unique values in general. That would serve as a benchmark for identifying false positives when using Analytics data for major business insights.

 

Best,

Kishore

10 Replies

Avatar

Community Advisor and Adobe Champion

Dear ddierking,

Can you explain more about the hash collision you have in mind?

Are you trying to get an alert or notification from Adobe whenever the Tracking Code's unique value count reaches 1 million? If so, I'm afraid that is not possible.

Thank You, Pratheep Arun Raj B (Arun) | NextRow Digital | Terryn Winter Analytics

Avatar

Community Advisor

Are you referring to the Low Traffic thresholds? https://experienceleague.adobe.com/docs/analytics/technotes/low-traffic.html?lang=en

Comparing with Data Warehouse or Data Feeds might help you discover which values have been bucketed under low traffic. In your case, these should be tracking codes that have very little traffic.
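As a rough illustration of that comparison, here is a minimal Python sketch that lines up two CSV exports and lists the tracking codes whose counts differ. The column names (`Tracking Code`, `Instances`), the helper names, and the sample rows are all made up for the example, not taken from any Adobe export format. A mismatch only flags a value worth a closer look: it could come from low-traffic bucketing, a hash collision, or ordinary processing differences between the two reports.

```python
import csv
import io

def load_counts(csv_text, key_col="Tracking Code", metric_col="Instances"):
    """Parse a CSV export into {value: count}. Column names are assumptions."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_col]: int(row[metric_col]) for row in reader}

def compare_reports(warehouse, workspace, tolerance=0):
    """Return {value: (warehouse_count, workspace_count)} where counts differ."""
    diffs = {}
    for value in sorted(set(warehouse) | set(workspace)):
        dw = warehouse.get(value, 0)  # 0 if the value is missing from the export
        ws = workspace.get(value, 0)
        if abs(dw - ws) > tolerance:
            diffs[value] = (dw, ws)
    return diffs

# Hypothetical sample data standing in for the two exported reports.
warehouse_csv = """Tracking Code,Instances
email_spring,120
social_a,3
social_b,2
"""
workspace_csv = """Tracking Code,Instances
email_spring,120
social_a,5
"""

diffs = compare_reports(load_counts(warehouse_csv), load_counts(workspace_csv))
for value, (dw, ws) in diffs.items():
    print(f"{value}: Data Warehouse={dw}, Workspace={ws}")
```

In this made-up sample, `social_b` is missing from the Workspace export while `social_a` shows extra instances there, which is the kind of pattern you would then investigate by hand.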

Avatar

Level 2

Here is a quick article on hash collisions.

https://experienceleague.adobe.com/docs/analytics/implementation/validate/hash-collisions.html?lang=...

Essentially I want to know how often this happens (or at least an estimate).  

I really like Kishore_Reddy's idea: perhaps Adobe could share stats on what percentage of collisions to expect with 500K and 1M unique values in general. That would serve as a benchmark for identifying false positives when using Analytics data for major business insights.

Avatar

Employee Advisor

@Deleted Account Here are the numbers that determine when values go into "low traffic":

For the first 500K values, we check whether new keys have fewer than 100 instances in a day. A key with fewer than 100 instances goes under low traffic; a key with more comes out of low traffic.

For 1M values, we check whether instances are fewer than 10 in a day.

Avatar

Community Advisor

@VaniBhemarasetty Thank you for this information. It really helps to understand how we can reduce the "low volume" row in Workspace reports with the 1M unique value limit.

 

However, @ddierking was asking how we can know whether a hash collision happened while Adobe is processing eVar and prop data for a report.

 

So, in essence, even after increasing the limit to 1M unique values:

  1. Can a hash collision still occur?
  2. If a collision occurs, how does an Adobe Analytics user or admin find out about it?
  3. If a collision occurs, how can we find the percentage of data (in terms of number of hash values) affected by it?

    (For example: if, out of 1M unique hash values, 10K values reuse a hash already assigned to another value in an eVar, that would be a 1% hash collision rate.)

If no such feature (points 2 and 3) is available in Adobe Analytics, it would be helpful to see a common benchmark that Adobe has generally seen, such as:

- It's common to see a hash collision for 3% of the hash values for an Analytics account with 500K unique values

- It's expected to see a hash collision for 1% of the hash values for an Analytics account with 1M unique values

 

This way, a reporting analyst (working on Adobe Analytics reports for a business) can always allow for x% false-positive data in their reports.
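Until Adobe publishes such numbers, a back-of-the-envelope estimate is possible with the standard birthday approximation. The sketch below is not Adobe's method: it assumes a uniformly distributed 32-bit hash, and both the hash width and the function names are assumptions made for illustration. The actual hash size and how often the hash table resets should be checked against the hash collisions article linked earlier in the thread.

```python
import math

def expected_colliding_pairs(n_values, hash_bits=32):
    """Birthday approximation: expected number of value pairs sharing a hash.

    Assumes hashes are uniform over 2**hash_bits buckets (an assumption,
    not a documented property of any specific product).
    """
    buckets = 2 ** hash_bits
    return n_values * (n_values - 1) / (2 * buckets)

def prob_at_least_one_collision(n_values, hash_bits=32):
    """Approximate probability that at least one pair of values collides."""
    return 1.0 - math.exp(-expected_colliding_pairs(n_values, hash_bits))

for n in (500_000, 1_000_000):
    pairs = expected_colliding_pairs(n)
    # Each colliding pair involves two values, so this is a rough upper
    # bound on the share of unique values touched by a collision.
    share = 2 * pairs / n
    print(f"{n:>9,} values: ~{pairs:.1f} colliding pairs, "
          f"~{share:.4%} of values, "
          f"P(any collision)={prob_at_least_one_collision(n):.6f}")
```

Under these assumptions, some collision is close to certain at both limits, but the share of values involved stays well under 0.1%, far below the illustrative 1% and 3% figures above.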

 

@ddierking Please feel free to correct me if I have not outlined your query correctly.

 

 

Best,

Kishore

Avatar

Employee Advisor

@Kishore_Reddy Thanks for the additional input on the query. Maybe you can submit this as an idea: a way for Analytics users to be notified when a hash collision occurs.

You can submit it here:

https://experienceleaguecommunities.adobe.com/t5/adobe-analytics-ideas/idb-p/adobe-analytics-ideas

 

Avatar

Former Community Member

@ddierking Did you manage to find out the percentage of hash collisions?