SOLVED

Regular Expressions and Numbers in Strings

Community Advisor

Looking for someone who has had experience in using Regular Expressions in the Classification Rule Builder.

We have an eVar that is collecting the number of search results in this fashion:

<Total Results>_<# of Item 1>_<# of Item 2>_<# of Item 3>_<# of Item 4>

Example output would look like this:

150_50_0_25_75

What we've done is initially create a Regular Expression that looks like this:

^(.+)\_(.+)\_(.+)\_(.+)\_(.+)$

The problem is, it appears in situations where the output contains a zero in one of the slots, the value is ignored and it receives the value in the next place over.  Using the example output shown above, I would end up with values like this:

$0 150_50_0_25_75

$1 150

$2 50

$3 25

$4 75

$5 {null}

Here's the weird part.  When I perform a test of a single record, it appears like it will work just fine, but when it actually runs in Omniture, it's not working as expected.  Here's something else I'd like to know if it's possible to address.  The five-place string is only the newest iteration of this approach.  In the past, we started out with a two-place version, then three-place and then four.  Any recommendations for handling all scenarios?

Any and all advice is welcome.  Thanks in advance!

[Moved from the general Adobe Experience Cloud forum into an Adobe Analytics forum - moderator]

Jeff Bloomer
1 Accepted Solution

Correct answer by
Community Advisor

I know this thread ended up under the wrong group.  If a moderator at Adobe has the ability to move it under Adobe Analytics, that would be great.  Meanwhile, seeing this was never answered, I wanted to come back and provide the learnings I had a little later.

Using Regular Expressions to manage Classification Rules has two parts:

  1. The proper expression to capture the appropriate data
  2. Setting a specific classification for each element of the string.

For example:

I have an eVar that captures number of search results for different items on a search page with output that looks something like this:

624_5_5_0_614

Here's how the data classification breaks out:

  1. Total Results
  2. Number of Articles
  3. Number of Coupons
  4. Number of Recipes
  5. Number of Products

To capture Number of Recipes correctly, including any 0 values, I have to create the following two rules in the Classification rule:

Match Criteria                      | Set Classification | To
^(\d+)\_(\d+)\_(\d+)\_(\d+)\_(\d+)$ | Number of Recipes  | $4
^(\d+)\_(\d+)\_(\d+)\_(0)\_(\d+)$   | Number of Recipes  | zero

The second rule handles the case where Number of Recipes is zero.  If I only used the first rule to assign all values for Number of Recipes, the rule encounters an error when it sees the zero and proceeds to the next value in the string, which means a value of 614 would end up in Number of Recipes.  The second rule accounts for a 0 in that position and converts it to a text string the Classification rule understands, so the data now aligns where it should each time.
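The two rules above can be tried outside Adobe Analytics with a small Python sketch. It assumes rules are evaluated top to bottom and that a later matching rule overwrites an earlier one for the same classification; that ordering and the helper name `classify_recipes` are assumptions for illustration, not confirmed Rule Builder behavior. Python's `re` module also returns the 0 from `$4` without trouble, so this only shows the intended result and does not reproduce the misalignment.

```python
import re

# (match criteria, value to set) for the "Number of Recipes" classification.
# "$4" stands for the fourth capture group; "zero" is a literal text value.
RULES = [
    (r"^(\d+)_(\d+)_(\d+)_(\d+)_(\d+)$", "$4"),
    (r"^(\d+)_(\d+)_(\d+)_(0)_(\d+)$", "zero"),
]

def classify_recipes(key):
    """Return the Number of Recipes value for key, or None if no rule matches."""
    value = None
    for pattern, target in RULES:
        match = re.match(pattern, key)
        if match is None:
            continue
        # Assumption: a later matching rule overwrites an earlier one.
        value = match.group(4) if target == "$4" else target
    return value

print(classify_recipes("624_5_5_0_614"))   # zero
print(classify_recipes("624_5_5_12_614"))  # 12
```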

Ultimately, because we had a results string that changed length as we added more possibilities for results, I ended up creating a total of 30 match criteria in this specific classification rule.  Hope this helps anyone else who has a similar issue.  Best wishes!
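Since the string grew from two places to five, a pattern made only of required `(\d+)` groups has to exist once per length. Here is a minimal Python sketch of generating those patterns; `pattern_for` and `split_key` are hypothetical helpers, and this does not attempt to reproduce the 30 match criteria mentioned above.

```python
import re

def pattern_for(places):
    """Build an anchored pattern with one digit group per underscore-separated place."""
    return "^" + "_".join([r"(\d+)"] * places) + "$"

# One pattern per historical string length (two to five places).
PATTERNS = {n: pattern_for(n) for n in range(2, 6)}

def split_key(key):
    """Return the captured places for the first pattern that matches, else None."""
    for pattern in PATTERNS.values():
        match = re.match(pattern, key)
        if match:
            return match.groups()
    return None

print(PATTERNS[5])                 # ^(\d+)_(\d+)_(\d+)_(\d+)_(\d+)$
print(split_key("150_50"))         # ('150', '50')
print(split_key("624_5_5_0_614"))  # ('624', '5', '5', '0', '614')
```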

Jeff Bloomer

6 Replies

Community Advisor

After some playing around on rubular.com, I think the Regular Expression should be built this way instead:

^(\d+)\_(\d+)\_(\d+)\_(\d+)\_(\d+)$
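For anyone who wants to compare the two patterns outside rubular.com, here is a minimal Python sketch. The underscore has no special meaning in a regular expression, so it is written without the backslash. Python's `re` module is only a stand-in here; the Rule Builder's engine may behave differently. The six-place key is a made-up example.

```python
import re

LOOSE = re.compile(r"^(.+)_(.+)_(.+)_(.+)_(.+)$")        # original pattern
STRICT = re.compile(r"^(\d+)_(\d+)_(\d+)_(\d+)_(\d+)$")  # digits-only pattern

key = "150_50_0_25_75"
print(LOOSE.match(key).groups())   # ('150', '50', '0', '25', '75')
print(STRICT.match(key).groups())  # ('150', '50', '0', '25', '75')

# With an extra place, the greedy (.+) swallows an underscore into group 1,
# while the digits-only pattern refuses to match at all.
longer = "150_50_0_25_75_99"
print(LOOSE.match(longer).groups())  # ('150_50', '0', '25', '75', '99')
print(STRICT.match(longer))          # None
```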

Again, still looking for any additional guidance from more experienced individuals.  Thanks!

Jeff Bloomer

Community Advisor

I may have ended up figuring this out, but I am definitely still interested in any feedback.  Thanks!

Jeff Bloomer

Level 1

This is to do with Adobe Marketing Cloud, correct?

These forums are for Adobe Business Catalyst.

Community Advisor

Apologies.  I got here via a circuitous route.  I will redirect as appropriate.

Jeff Bloomer

Employee

Hi Jeff,

I am not sure if this question has been answered as you'd expected. I had a similar requirement where the values were split into four fields and any one or more field could be blank.

I worked out this expression with the help of a developer on stackoverflow.com. It might help you or someone else looking for an answer to this common request.

Here is the same regex in action on Rubular.com. Hope it helps.

Rubular: ^(.*?)\|(.*?)\|(.*?)\|(.*)$
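As a rough check of how that expression treats blank fields, here is a minimal Python sketch. The sample keys and the `split_fields` helper are made up for illustration, and Python's `re` module may not behave exactly like the Rule Builder.

```python
import re

# Four pipe-delimited fields; .*? and .* allow any field to be empty.
PATTERN = re.compile(r"^(.*?)\|(.*?)\|(.*?)\|(.*)$")

def split_fields(key):
    """Return the four captured fields, or None if the key lacks three pipes."""
    match = PATTERN.match(key)
    return match.groups() if match else None

print(split_fields("a|b|c|d"))  # ('a', 'b', 'c', 'd')
print(split_fields("a||c|"))    # ('a', '', 'c', '')
print(split_fields("a|b"))      # None
```

Because each group may match an empty string, a blank field still occupies its own slot and later fields do not shift over.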