Adobe Analytics

Yohan_khan00 · 10/11/23

I am updating an existing implementation that uses a colon delimiter. One of our fields is a text field that captures the text of various items, some of which are article names. The issue is that some article titles contain a colon, which is throwing off the existing rule.

Here is the expression I am using.

(?i)^(.*?):(.*):(.*):(.*)(?-i)

What I am trying to achieve is the following:

$0 Trending Articles 123:Link:Emerging Issues Executive Quarterly: Insurance in the Inflationary Era:Undefined

To be: (note the colon on $3)

$1 Trending Articles 123

$2 Link:

$3 Emerging Issues Executive Quarterly: Insurance in the Inflationary Era

$4 Undefined

It is currently returning:

$0 Trending Articles 123:Link:Emerging Issues Executive Quarterly: Insurance in the Inflationary Era:

$1 Trending Articles 123

$2 Link:Emerging Issues Executive Quarterly

$3 Insurance in the Inflationary Era

$4

Any help modifying the regular expression is welcome: (?i)^(.*?):(.*):(.*):(.*)(?-i)

Jennifer_Dungan · 10/11/23

Hi,

I am not sure what the (?i) at the beginning or the (?-i) at the end are supposed to be doing in your regex. When I ran it in an online regex tester, both came back as invalid.

However, using the string

"Trending Articles 123:Link:Emerging Issues Executive Quarterly: Insurance in the Inflationary Era:"

with this regex:

^(.*?):(.*?):(.*):(.*)$

results in:

$0 - Trending Articles 123:Link:Emerging Issues Executive Quarterly: Insurance in the Inflationary Era:
$1 - Trending Articles 123
$2 - Link
$3 - Emerging Issues Executive Quarterly: Insurance in the Inflationary Era
$4 - ""

Basically, (.*?) breaks down as follows: "." matches any character except line breaks, "*" means 0 or more of them, and the trailing "?" is the lazy quantifier, so the group matches as little as possible.

In your original regex (minus the (?i) and (?-i) parts), that meant the first colon in your string forced the first extracted group to stop.

The next part (Link), however, wasn't using the lazy designation, and because the text has one more colon than the pattern has delimiters, that group took on the extra values. I made the second group lazy as well, so the break now occurs right after "Link". The next part (the article title with the extra characters) then takes on ALL extra colons up to the last one in the string, and whatever follows that last colon becomes your part 4.
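The greedy versus lazy difference described above can be checked outside Adobe with a small script. This is only a sketch using Python's re module, which is an assumption here and may not behave identically to the engine behind the Classification Rule Builder. The (?i) and (?-i) markers are left out, since support for inline flags varies between regex flavors, and the helper name split_groups is made up for illustration.

```python
import re

# Sample value from the question, with the extra colon inside the article title.
SAMPLE = (
    "Trending Articles 123:Link:Emerging Issues Executive Quarterly: "
    "Insurance in the Inflationary Era:Undefined"
)

# Original rule: only the first group is lazy, the second is greedy.
ORIGINAL = r"^(.*?):(.*):(.*):(.*)"
# Revised rule: the second group is lazy too, and the end is anchored.
REVISED = r"^(.*?):(.*?):(.*):(.*)$"


def split_groups(pattern, text):
    """Return the captured groups as a tuple, or None if there is no match."""
    match = re.match(pattern, text)
    return match.groups() if match else None


original_groups = split_groups(ORIGINAL, SAMPLE)
revised_groups = split_groups(REVISED, SAMPLE)

for label, groups in (("original", original_groups), ("revised", revised_groups)):
    print(label)
    for number, value in enumerate(groups, start=1):
        print(f"  ${number} - {value!r}")
```

With the original pattern the greedy second group swallows the first half of the title; with the revised pattern the third group keeps the whole title, colon included.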

I hope this helps

Yohan_khan00 · 10/12/23

Not only did this help, it also addressed some issues that I had not yet discovered. Kudos!

Jennifer_Dungan · 10/12/23

So glad that helped you.

FYI, for the record, I like to use http://www.regextester.com/ to test all my regex rules before I even start building in Adobe's Rule Builder.

This lets me test multiple examples at once (you have to turn on multi-line) to see how the rule is shaping up. Then you can build the rule with one sample, confirm the groups that Adobe shows, and do the final Test Rules with multiple samples again.
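The same multi-sample check can also be scripted. The sketch below again assumes Python's re module rather than Adobe's own engine, and every sample value other than the one from this thread is invented for illustration.

```python
import re

# Revised rule from the accepted answer.
PATTERN = re.compile(r"^(.*?):(.*?):(.*):(.*)$")

# Hypothetical sample values; only the second one comes from this thread.
SAMPLES = [
    "Trending Articles 123:Link:Plain Title:Undefined",
    "Trending Articles 123:Link:Emerging Issues Executive Quarterly: "
    "Insurance in the Inflationary Era:Undefined",
    "Trending Articles 123:Link:Part 1: Part 2: Part 3:",
    "Too:Few",
]


def classify(value):
    """Return the four captured groups, or None when the value does not match."""
    match = PATTERN.match(value)
    return match.groups() if match else None


results = {sample: classify(sample) for sample in SAMPLES}

for sample, groups in results.items():
    print(sample)
    print("  ->", groups)
```

A value with fewer than three colons does not match at all, so it shows up as None. Note that the rule still assumes the last field never contains a colon, because the final colon in the string is always treated as the delimiter before part 4.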

Classification Rule Builder - Regular Expression Help
