I am updating an existing implementation that uses a colon delimiter. One of our fields is a text field that captures the text of various items, some of which are article names. The issue is that some article titles contain a colon, which throws off the existing rule.
Here is the expression I am using.
(?i)^(.*?):(.*):(.*):(.*)(?-i)
What I am trying to achieve is the following:
$0: Trending Articles 123:Link:Emerging Issues Executive Quarterly: Insurance in the Inflationary Era:Undefined
To be: (note the colon on $3)
Hi,
I am not sure what the (?i) at the beginning or the (?-i) at the end are supposed to be doing in your regex. When I tried this in an online regex tester, both came back as invalid.
However, using the string
"Trending Articles 123:Link:Emerging Issues Executive Quarterly: Insurance in the Inflationary Era:"
with this regex:
^(.*?):(.*?):(.*):(.*)$
results in:
Basically, (.*?) in the regex breaks down as follows: "." matches any character except line breaks, "*" matches 0 or more of them, and "?" is the lazy quantifier, meaning match as little as possible.
In your original regex (minus the odd things), that meant the first group stopped at the first colon in your string.
But the next part (Link) wasn't using the lazy designation, and because you had one more colon in the text than the pattern expected, this group took on the extra values. I just made the second group lazy as well, so that the break occurs after "Link". The next part (the article title with the extra colon) now takes on ALL extra colons up to the last one in the string, and whatever follows that last colon becomes your part 4.
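If it helps to see the difference side by side, here is a rough sketch in Python. Note the assumptions: Python's `re` module stands in for whatever regex engine the Rule Builder uses (the two may not behave identically), the (?i)/(?-i) flags are left out, and the sample is the string from the question.

```python
import re

# Sample string from the question (four colons, one of them inside the title).
sample = ("Trending Articles 123:Link:Emerging Issues Executive Quarterly: "
          "Insurance in the Inflationary Era:Undefined")

# Original pattern without the inline flags: only group 1 is lazy.
original = re.compile(r"^(.*?):(.*):(.*):(.*)")
# Adjusted pattern: groups 1 and 2 are lazy, group 3 stays greedy.
fixed = re.compile(r"^(.*?):(.*?):(.*):(.*)$")

original_groups = original.match(sample).groups()
fixed_groups = fixed.match(sample).groups()

# Greedy group 2 swallows "Link:" plus the first half of the title.
print(original_groups)
# Lazy group 2 stops at "Link"; greedy group 3 keeps the whole title.
print(fixed_groups)
```

With the original pattern, the second group comes out as "Link:Emerging Issues Executive Quarterly" and the title is cut in half. With the adjusted pattern, the third group holds the full title, colon included.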
I hope this helps
Not only did this help, it also addressed some issues that I had not yet discovered. Kudos!
So glad that helped you.
FYI, for the record, I like to use http://www.regextester.com/ to test all my regex rules before I even start building in Adobe's Rule Builder.
This allows me to test multiple examples (you have to turn on multi-line) to see how the rule is shaping up. Then you can build the rule with the one sample, confirm the groups that Adobe shows, and then do the final Test Rules with multiple samples again.
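The same multi-sample check can be approximated offline. This is only a sketch, assuming Python's `re` module as a stand-in for the online tester: `re.MULTILINE` plays the role of the multi-line switch, the first two samples are the strings from this thread, and the third is a made-up example of a title with no extra colon.

```python
import re

samples = [
    # From the question: title contains a colon, last field is "Undefined".
    "Trending Articles 123:Link:Emerging Issues Executive Quarterly: "
    "Insurance in the Inflationary Era:Undefined",
    # From the answer: same string with an empty last field.
    "Trending Articles 123:Link:Emerging Issues Executive Quarterly: "
    "Insurance in the Inflationary Era:",
    # Hypothetical sample (not from the thread): title without a colon.
    "Trending Articles 123:Link:Some Plain Title:Undefined",
]

# MULTILINE makes ^ and $ anchor at every line, so each sample is matched
# on its own line, much like pasting several rows into a tester.
pattern = re.compile(r"^(.*?):(.*?):(.*):(.*)$", re.MULTILINE)

results = [m.groups() for m in pattern.finditer("\n".join(samples))]

for groups in results:
    print(groups)
```

Each sample should produce one match with four groups. If a row comes back with the title split across two groups, the rule needs another look before it goes into the Rule Builder.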