SOLVED

(Low Traffic) as number 1

Level 2

Hi guys,

Quite often I see the "Low Traffic" value ranked number 1 in the report results. I think I read somewhere that "Low Traffic" is meant to collect values with a small contribution, but most of the time the "Low Traffic" value contributes over 90% (see attached picture).

Any hints on how to get rid of it? When I break it down, it shows "NONE", which makes it hard to do any analysis on this data set.

Thanks,

Jakub

3 Replies

Employee

Hi Jakub,

The low traffic (uniques exceeded) designation for variable values is applied by an Adobe Analytics algorithm (the methodology is explained in the links below). If low traffic makes up 90% of the report, the variable evidently has extremely high cardinality. I would recommend reviewing the implementation to confirm whether this is necessary and by design. Anything you can do in the implementation to cut down the number of unique value variations will help, but if the business requires the current granularity of the variable values, there may not be much you can do in R&A.

http://blogs.adobe.com/digitalmarketing/analytics/high-cardinality-reports/

http://helpx.adobe.com/analytics/kb/uniques-exceeded.html

  • Reporting is not affected as long as the variable does not reach 500,000 unique values in a given month.
  • When a variable reaches this first threshold of 500,000, data begins to be bucketed under (Low-Traffic). Each value beyond this threshold goes through the following logic:
    • If a value is already in reports, add an instance to that value.
    • If a value is not yet in reporting, check to see if that value was seen more than approximately ten times today. If it has, add this value to reporting. If it hasn't been counted more than about ten times, leave it under (Low-Traffic).
  • If a report suite reaches more than 1,000,000 unique values, more aggressive filtering is applied: 
    • If a value is already in reports, add an instance to that value.
    • If a value is not yet in reporting, check to see if that value was seen more than approximately 100 times today. If it has, add the value to reporting. If it hasn't, leave it under (Low-Traffic).
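To make the rules above easier to follow, here is a rough Python sketch of the logic as I read it. It is only a toy model, not Adobe's actual implementation: the class and method names are made up for illustration, and it assumes the unique-value count is the number of values already shown in reports, which the description above does not spell out.

```python
from collections import Counter

LOW_TRAFFIC = "(Low-Traffic)"


class LowTrafficSimulator:
    """Toy model of the bucketing rules listed above (hypothetical, simplified)."""

    def __init__(self, first_threshold=500_000, second_threshold=1_000_000,
                 first_min_daily=10, second_min_daily=100):
        self.first_threshold = first_threshold
        self.second_threshold = second_threshold
        self.first_min_daily = first_min_daily
        self.second_min_daily = second_min_daily
        self.reported = Counter()    # values visible in reports -> instances
        self.seen_today = Counter()  # today's sightings of values not yet in reports
        self.low_traffic = 0         # instances bucketed under (Low-Traffic)

    def start_new_day(self):
        # The "seen today" counts only apply within a single day.
        self.seen_today.clear()

    def record(self, value):
        """Record one instance and return the line item it is reported under."""
        # A value that is already in reports always gets the instance.
        if value in self.reported:
            self.reported[value] += 1
            return value

        unique_count = len(self.reported)  # assumption: counts reported values only
        # Below the first threshold every new value goes straight into reports.
        if unique_count < self.first_threshold:
            self.reported[value] = 1
            return value

        # Past a threshold a new value has to be seen often enough today.
        if unique_count > self.second_threshold:
            min_daily = self.second_min_daily
        else:
            min_daily = self.first_min_daily

        self.seen_today[value] += 1
        if self.seen_today[value] > min_daily:
            self.reported[value] = 1
            return value

        self.low_traffic += 1
        return LOW_TRAFFIC
```

If you run it with small thresholds and feed it values that almost never repeat, nearly every instance after the first threshold lands under (Low-Traffic), which matches the kind of report described in the original question.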

Best,

Brian

Level 2

Hi Brian,

Thanks for the reply. I kind of thought this could be the reason. Yes, there is a large volume of unique instances.

The strings come from "SEARCH", so pretty much every search is different, hence the large volume. The string is constructed in processing rules (if a search is triggered, the string gets country|state|area|pricefrom|priceto| and so on....), so it's very unlikely that two search instances would be the same.

I am not sure how to change the implementation. The site has a volume of 300K+ UVs a day, so on average, if each user does 3 or 4 searches, we end up with over 1 million unique strings constructed from the search.

Any hints on how to improve the implementation so it gives us better insights?

Thanks in advance,

Jakub

Correct answer by
Employee

Hi Jakub,

Because of the high cardinality of the search variations and the way you are merging several values into one string, there is no workaround other than capturing the data at a higher, summarized level or splitting the string apart. As currently captured in one combined string, the data will inevitably yield a high (Low Traffic) percentage in Reports & Analytics, and it is only viable for analysis in your raw clickstream data as part of a big-data type of effort.
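As a rough illustration of what "splitting apart" and "summarizing" could look like, here is a small Python sketch. It assumes the combined string starts with the five fields you mentioned (country|state|area|pricefrom|priceto). The function names and the 50,000 price band width are made up for the example, and the real change would of course happen in your implementation or processing rules rather than in Python.

```python
# Field order taken from the string described in the question; anything after
# these five fields is ignored in this sketch.
FIELD_NAMES = ("country", "state", "area", "pricefrom", "priceto")


def split_search_string(combined, delimiter="|"):
    """Split the combined search string into named fields."""
    parts = combined.split(delimiter)
    return {name: part.strip() for name, part in zip(FIELD_NAMES, parts)}


def price_band(raw_price, band_width=50_000):
    """Collapse an exact price into a coarse band to reduce unique values."""
    try:
        price = int(raw_price)
    except (TypeError, ValueError):
        return "unknown"
    lower = (price // band_width) * band_width
    return f"{lower}-{lower + band_width - 1}"


def summarize_search(combined):
    """Return low-cardinality values that could each go into their own variable."""
    fields = split_search_string(combined)
    return {
        "country": fields.get("country", ""),
        "state": fields.get("state", ""),
        "area": fields.get("area", ""),
        "price_from_band": price_band(fields.get("pricefrom")),
        "price_to_band": price_band(fields.get("priceto")),
    }
```

Each of the resulting values has far fewer possible variations than the full combined string, so reported separately they are much less likely to run into the unique-value thresholds.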

Best,

Brian