Adobe Analytics

AMinakov · 8/24/22

Hello everyone!

I have implemented a new bucketed dimension using a traffic variable (prop) classification.

In most cases it works perfectly, but some values end up in 'Unspecified' in the new classified dimension when I expect them to fall into another value. It isn't a problem with the regular expressions, because the same value, such as '1', can appear both in the '1 - 10' item and in 'Unspecified'.

Maybe there is a limit on the number of values in the prop? I have about 8-10 million occurrences per day.
Please write if you have any ideas about the cause.

VaniBhemarasetty · 8/24/22

@AMinakov The lookback window could be one of the reasons for this. Below are some points to keep in mind:

Sub-classifications are not supported with Classification Rule Builder (CRB).
Our current classification system can only export up to 10 million rows at a time.
When CRB requests an export, it pulls both classified AND unclassified values, with the unclassified values coming through at the end of the export. This means that, over time, the export could fill up with 10 million classified values without ever reaching the unclassified ones.
Because the architecture is set up in a way that CRB could be pulling from “n” number of servers, this can lead to inconsistencies as to which servers get picked up and in what order. For that reason, it is very difficult to get to unclassified values.
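
To make the export limit concrete, here is a minimal sketch of the arithmetic described above. It assumes a deliberately simplified model in which classified values are always exported first and unclassified values only fill whatever room is left; the function name is hypothetical, and the server-ordering inconsistency mentioned above is not modeled.

```python
EXPORT_LIMIT = 10_000_000  # row cap per export, as stated in this thread


def unclassified_rows_reached(classified, unclassified, limit=EXPORT_LIMIT):
    """Return how many unclassified rows fit into a single export.

    Simplifying assumption: classified rows come first, and
    unclassified rows only use the remaining capacity.
    """
    remaining = max(limit - classified, 0)
    return min(remaining, unclassified)


if __name__ == "__main__":
    # With 9.5 million classified values, only part of the
    # unclassified values can still fit into the export.
    print(unclassified_rows_reached(9_500_000, 1_000_000))
    # With 10 million or more classified values, none fit.
    print(unclassified_rows_reached(12_000_000, 1_000_000))
```

Under this model, once a dimension holds 10 million or more classified values, no unclassified value is ever seen by the export, which matches the behavior described in the points above.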

This is the workaround for those who have more than 10 million classified values for a dimension: you will need to export the unclassified values via FTP, in batches of 10 million, and classify them manually.
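
The manual step of this workaround could be scripted roughly as follows. This is only a sketch under stated assumptions: the thread does not show the export or import file layout, so the two-column tab-separated output, the `bucket` rule, and all function names here are hypothetical, and the FTP transfer itself is left out.

```python
import re
from itertools import islice

BATCH_SIZE = 10_000_000  # batch size taken from the workaround above


def bucket(key):
    """Hypothetical rule: whole numbers 1..10 go to the '1 - 10' bucket."""
    if re.fullmatch(r"[1-9]|10", key):
        return "1 - 10"
    return ""  # left blank, so the key stays unclassified


def batches(keys, size=BATCH_SIZE):
    """Yield successive lists of at most `size` keys."""
    it = iter(keys)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def to_rows(keys):
    """Build 'key<TAB>bucket' lines for one batch (assumed layout)."""
    return [f"{key}\t{bucket(key)}" for key in keys]


if __name__ == "__main__":
    exported_keys = ["1", "7", "10", "abc"]  # stand-in for an export
    for batch in batches(exported_keys, size=2):
        print(to_rows(batch))
```

Each batch of rows would then have to be converted to whatever format the classification importer actually expects before being uploaded.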

Pablo_Childe · 8/24/22

First question: how are you classifying things?

Are you using Classification Rule Builder, or uploading classification files?

AMinakov · 8/24/22

Pablo, thank you for your attention!
I used Classification Rule Builder.

Pablo_Childe · 8/24/22

I am not aware of any limit based on the sheer number of values processed.

I suspect that, at this volume, you are simply getting instances or variations that look as though they should fall into one classification but are being processed as Unspecified.

Could it be that, when so many values are sent, some get corrupted and end up with extra characters in them?

It is tough to tell without examples and without seeing the classification logic. I do know I have had to readjust my classification regex from time to time, because new variations can behave the way you describe.
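
One way to test the "extra characters" theory is to run the raw values through the same rules offline and flag those that only match after surrounding whitespace is stripped. The sketch below assumes hypothetical bucket rules and Python's `re` syntax, which may differ from the regex flavor used by Classification Rule Builder.

```python
import re

# Hypothetical bucketing rules: (pattern, bucket label).
RULES = [
    (re.compile(r"[1-9]|10"), "1 - 10"),
    (re.compile(r"1[1-9]|[2-4][0-9]|50"), "11 - 50"),
]


def classify(value):
    """Return the first bucket whose pattern matches the whole value."""
    for pattern, label in RULES:
        if pattern.fullmatch(value):
            return label
    return "Unspecified"


def audit(values):
    """Return repr() of values that match only once whitespace is stripped."""
    suspects = []
    for raw in values:
        if classify(raw) == "Unspecified" and classify(raw.strip()) != "Unspecified":
            suspects.append(repr(raw))
    return suspects


if __name__ == "__main__":
    # '1 ' and '\t7' look like '1' and '7' in a report,
    # but they are different keys for the rules.
    print(audit(["1", "1 ", "\t7", "abc"]))
```

Printing the values with `repr()` makes trailing spaces and tabs visible, which a report UI would normally hide.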

yuhuisg · 8/24/22

Do the "Unspecified" ones appear only when looking at the current day's data, or with old data too? If with old data, how far back does "Unspecified" appear?

AMinakov · 8/24/22

They appear in the current data too. I thought about the lookback window as well, but the problem is most likely caused by the data volume, because the percentage of Unspecified decreases over time.
