Classification Rule Limits | Community
August 24, 2022
Solved

Classification Rule Limits

  • August 24, 2022
  • 2 replies
  • 2062 views

Hello everyone!

I have implemented a new bucketed dimension using a Traffic Variable classification.

In most cases it works perfectly, but some values fall into 'Unspecified' in the new classified dimension when I expect them to land in another bucket. It isn't a problem with the regular expressions, because the same value (e.g. '1') appears both in the '1 - 10' bucket and in 'Unspecified'.

Maybe there is a limit on the number of values in the prop? I have roughly 8-10 million occurrences per day.
Please reply if you have any ideas about the cause.
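For context, the kind of regex-based bucketing described above can be sketched in Python. The bucket patterns here are illustrative assumptions, not the poster's actual rules; the point is that a value like '1' should deterministically match one bucket, so seeing it both classified and 'Unspecified' suggests something other than the regex is at fault.

```python
import re

# Hypothetical bucketing rules mimicking what a classification rule
# with a "Regular Expression" match type would do: the first matching
# rule wins, and anything unmatched stays "Unspecified".
RULES = [
    (re.compile(r"^([1-9]|10)$"), "1 - 10"),
    (re.compile(r"^(1[1-9]|[2-9][0-9]|100)$"), "11 - 100"),
]

def classify(value: str) -> str:
    for pattern, bucket in RULES:
        if pattern.match(value):
            return bucket
    return "Unspecified"  # no rule matched

print(classify("1"))    # "1 - 10"
print(classify("42"))   # "11 - 100"
print(classify("abc"))  # "Unspecified"
```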

Best answer by VaniBhemarasetty

2 replies

Pablo_Childe
Community Advisor
August 24, 2022

First question: how are you classifying things?

Using the Classification Rule Builder, or uploading files to be classified?

AMinakov (Author)
Level 2
August 24, 2022

Pablo, thank you for your attention!
I used the Classification Rule Builder.

Pablo_Childe
Community Advisor
August 24, 2022

I am not aware of any limitation based on sheer numbers processed.

I suspect that, at the volume you encounter, some instances/variations that look like they should fall into one classification are somehow being processed as unspecified.

Could there be some logic whereby, when so many values are sent, some get corrupted and end up with extra characters in them?

It's tough to diagnose without examples and without seeing the classification logic. I do know I have had to readjust my classification regex from time to time, as new variations can behave as you describe.
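As a concrete illustration of Pablo's point, a single stray character is enough to push a value out of a strictly anchored regex bucket. The pattern below is hypothetical, not the poster's actual rule:

```python
import re

# Strict "1 - 10" bucket rule, anchored at both ends (hypothetical).
pattern = re.compile(r"^([1-9]|10)$")

print(bool(pattern.match("1")))        # True  - clean value is classified
print(bool(pattern.match("1 ")))       # False - trailing space -> Unspecified
print(bool(pattern.match("\u00a01")))  # False - leading non-breaking space -> Unspecified
```

If invisible characters are the cause, loosening the anchors (e.g. `r"^\s*([1-9]|10)\s*$"`) or cleaning the value at collection time would avoid the mismatch.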

yuhuisg
Community Advisor
August 24, 2022

Do the "Unspecified" ones appear when looking at the current day's data only, or with old data too? If old data, how far back do "Unspecified" appear?

AMinakov (Author)
Level 2
August 24, 2022

It appears in the current data too. I also thought about the lookback window, but the problem is most likely caused by the data volume, because over time the percentage of Unspecified decreases.

VaniBhemarasetty
Adobe Employee (Accepted solution)
August 24, 2022

@aminakov The lookback window would be one of the reasons for this. Below are some points you would need to keep in mind:

  • Sub-classifications are not supported with Classification Rule Builder (CRB).
  • The current classification system can only export up to 10 million rows at a time.
  • When CRB requests an export, it pulls both classified and unclassified values, with unclassified values coming at the end of the export. This means that, over time, the export can fill up with 10 million classified values without ever reaching the unclassified values.
  • Because the architecture allows CRB to pull from "n" number of servers, there can be inconsistencies in which servers get picked up and in what order. For that reason, it is very difficult to reach the unclassified values.

This is the workaround for those who have more than 10 million classified values for a dimension: export the unclassified values via FTP in 10-million-row batches and classify them manually.
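The batching step of that workaround can be sketched as follows. This is only an assumption-laden illustration of splitting a large unclassified-values export into fixed-size chunks for manual classification and FTP re-upload; the file names and helper are hypothetical, and only the 10-million-row limit comes from the answer above:

```python
BATCH_SIZE = 10_000_000  # per the 10-million-row export limit described above

def split_into_batches(input_path: str, prefix: str,
                       batch_size: int = BATCH_SIZE) -> int:
    """Split a large export file into batch files of at most
    batch_size lines each; returns the number of batches written."""
    batch_num = 0
    out = None
    with open(input_path, encoding="utf-8") as src:
        for i, line in enumerate(src):
            if i % batch_size == 0:  # start a new batch file
                if out:
                    out.close()
                batch_num += 1
                out = open(f"{prefix}_{batch_num:03d}.txt", "w",
                           encoding="utf-8")
            out.write(line)
    if out:
        out.close()
    return batch_num
```

Each resulting `prefix_NNN.txt` file would then be classified and uploaded through the classification importer's FTP workflow.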