Hash collision reporting: increased unique limit value for Tracking Code to 1 million (from 500,000), how can I know if a hash collision happens?
ddierking
Level 2
March 8, 2022
Solved

  • 6 replies
  • 3417 views

Looking for hash collision reporting:

We recently increased the unique limit value for Tracking Code to 1 million (from 500,000). Is there a way to know whether a hash collision happens during processing for Adobe Workspace? Thinking out loud: could we compare a report from Data Warehouse with the same report in Workspace? Any other ideas?

6 replies

PratheepArunRaj
Community Advisor and Adobe Champion
March 9, 2022

Dear ddierking,

Can you explain more about Hash Collision?

Are you trying to get an alert or some other notification from Adobe whenever the number of unique Tracking Code values reaches 1 million? If so, I'm afraid that is not possible.

Thank You, Pratheep Arun Raj B (Arun) | NextRow Digital | Terryn Winter Analytics

Thank You, Pratheep Arun Raj B (Arun) | Xerago | Terryn Winter Analytics
Kishore_Reddy
Community Advisor
Accepted solution
March 9, 2022

Hey @ddierking

Great scenario!

I believe there is no feature that reports whether a hash collision occurred while the data was being processed for reporting in Workspace.

Your idea of comparing Workspace and Data Warehouse reports sounds good to me.

However, we could ask the Adobe product team to share statistics on what percentage of collisions to expect with 500K and 1M unique values in general. That would serve as a benchmark for identifying false positives when using Analytics data for major business insights.

Best,

Kishore

yuhuisg
Community Advisor
March 9, 2022

Are you referring to the Low Traffic thresholds? https://experienceleague.adobe.com/docs/analytics/technotes/low-traffic.html?lang=en

Comparing with Data Warehouse or Data Feeds might help you discover which values have been bucketed under low traffic. In your case, these should be tracking codes that have very little traffic.
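
A minimal sketch of that comparison, assuming both reports have been exported to CSV with one row per tracking code. The column names `tracking_code` and `instances` and the sample rows are hypothetical; real exports will look different. Values that appear in the Data Warehouse export but not in the Workspace export are candidates for having been bucketed under low traffic, and values whose counts differ may deserve a closer look.

```python
import csv
import io


def load_counts(csv_text, key_col="tracking_code", count_col="instances"):
    """Sum the instance counts per value from CSV text (hypothetical column names)."""
    counts = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = row[key_col]
        counts[key] = counts.get(key, 0) + int(row[count_col])
    return counts


def compare_reports(workspace, warehouse):
    """Return (values missing from Workspace, values whose counts differ)."""
    missing = sorted(set(warehouse) - set(workspace))
    mismatched = {
        key: (workspace[key], warehouse[key])
        for key in sorted(set(workspace) & set(warehouse))
        if workspace[key] != warehouse[key]
    }
    return missing, mismatched


# Made-up sample data standing in for the two exports.
WORKSPACE_CSV = """tracking_code,instances
email_spring,120
social_promo,45
"""

WAREHOUSE_CSV = """tracking_code,instances
email_spring,120
social_promo,40
partner_x1,3
"""

if __name__ == "__main__":
    missing, mismatched = compare_reports(
        load_counts(WORKSPACE_CSV), load_counts(WAREHOUSE_CSV)
    )
    print("Missing from Workspace:", missing)
    print("Count mismatches (workspace, warehouse):", mismatched)
```

A mismatch on its own does not prove a hash collision; it only narrows down which values to investigate.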

ddierking
Author
Level 2
March 9, 2022

Here is a quick article on hash collisions:

https://experienceleague.adobe.com/docs/analytics/implementation/validate/hash-collisions.html?lang=en

Essentially I want to know how often this happens (or at least an estimate).  

I really like Kishore_Reddy's idea: perhaps Adobe could share statistics on what percentage of collisions to expect with 500K and 1M unique values in general. That would serve as a benchmark for identifying false positives when using Analytics data for major business insights.

VaniBhemarasetty
Adobe Employee
March 9, 2022

@612565 Here are the numbers that determine when values go into "low traffic":

For the first 500K values, we check whether a new key has fewer than 100 instances in a day. If it has fewer than 100 instances, it goes under low traffic; if it has more, it comes out of low traffic.

For 1M values, we check whether the instances are fewer than 10 in a day.

Kishore_Reddy
Community Advisor
March 10, 2022

@vanibhemarasetty Thank you for this information. It really helps to understand how we can reduce the "low volume" row in Workspace reports with the 1M unique values limit.

However, @ddierking was asking how we can know whether a hash collision happened while Adobe was processing data for eVars and props for a report.

So, in essence, even after increasing the limit to 1M unique values:

  1. Does a hash collision still occur?
  2. If a collision occurs, how does an Adobe Analytics user or admin know about it?
  3. If a collision occurs, how can they find out the percentage of data (in terms of the number of hash values) affected by collisions?

(For example: if, out of 1M unique hash values, 10K values repeat after they have already been assigned to a value in an eVar, that would be a 1% hash collision rate.)

If no such feature (points 2 and 3) is available in Adobe Analytics, it would be helpful to see a common benchmark that Adobe has generally seen, such as:

- It's common to see a hash collision for 3% of the hash values for an Analytics account with 500K unique values

- It's expected to see a hash collision for 1% of the hash values for an Analytics account with 1M unique values

This way, a reporting analyst (working on Adobe Analytics reports for a business) can always account for x% of false-positive data in their reports.
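
Absent published figures, a rough back-of-the-envelope estimate is possible. The sketch below assumes values are hashed uniformly at random into a 32-bit space; both the uniformity and the 32-bit width are assumptions made for illustration, not confirmed details of how this variable is processed. Under that model it computes how many of n unique values are expected to land on a hash already taken by an earlier value.

```python
import math


def expected_collisions(unique_values, hash_bits=32):
    """Expected number of values that share a hash with an earlier value.

    Assumes an ideal uniform hash over 2**hash_bits slots (an assumption,
    not a documented property of any particular product).
    """
    slots = 2 ** hash_bits
    # Expected number of distinct slots occupied after hashing all values:
    # slots * (1 - (1 - 1/slots) ** unique_values), computed stably.
    occupied = -slots * math.expm1(unique_values * math.log1p(-1.0 / slots))
    return unique_values - occupied


def collision_rate(unique_values, hash_bits=32):
    """Expected collisions divided by the number of unique values."""
    return expected_collisions(unique_values, hash_bits) / unique_values


if __name__ == "__main__":
    for n in (500_000, 1_000_000):
        print(n, round(expected_collisions(n), 1), f"{collision_rate(n):.4%}")
```

Under these assumptions the estimate is a small fraction of a percent at both limits, and it grows roughly with the square of the number of unique values. The real rate depends on the actual hash width and algorithm, so treat this as an order-of-magnitude guide only.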

 

@ddierking Please feel free to correct me if I have not outlined your query correctly 🙂

Best,

Kishore

VaniBhemarasetty
Adobe Employee
March 10, 2022

@kishore_reddy Thanks for the additional input on the query. Perhaps you could submit this as an idea: a way to be notified in Analytics when a hash collision occurs.

You can submit it here:

https://experienceleaguecommunities.adobe.com/t5/adobe-analytics-ideas/idb-p/adobe-analytics-ideas

 

April 15, 2022

@ddierking Did you manage to find out the percentage of hash collisions?

ddierking
Author
Level 2
April 18, 2022

@17504104 no luck here.