SOLVED

SAINT Campaign Tracking Codes (cid, cmpid, camp) - Alternate regex friendly format

Hi All,

I've recently joined an organisation that uses reasonably complex campaign tracking codes, for example:

?cid=paid__GEN_BI_BRAN_CARE_DITO_INS_GEN_EX_20186_WA_GEN__p13836758146

The code consists of 13 segments (the paid prefix followed by 12 values, each captured by a regex match group), separated by 12 delimiters (single or double underscores). The issue I've found is that these codes are often full of mistakes: extra underscores added inside a match group, match groups containing no information, or match groups missing entirely.

The regex in Adobe's Classification Rule Builder is as follows:

paid__([^_]+)_([^_]+)_([^_]+)_([^_]+)_([^_]+)_([^_]+)_([^_]+)_([^_]+)_([^_]+)_([^_]+)_([^_]+)__p(\d{11})

which at first glance I find difficult to understand. Also, because it relies on every segment being present and correctly delimited, it will not match at all if a tracking code deviates from this format.
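To make the fragility concrete, here is a minimal sketch that runs the regex above against the example code and three malformed variants. It uses Python's re module as a stand-in, which is an assumption: the Classification Rule Builder's regex engine may differ in details. The three malformed codes and the helper name strict_groups are invented for illustration.

```python
import re

# The fixed-position pattern quoted above, split across two lines for readability.
STRICT = re.compile(
    r"paid__([^_]+)_([^_]+)_([^_]+)_([^_]+)_([^_]+)_([^_]+)"
    r"_([^_]+)_([^_]+)_([^_]+)_([^_]+)_([^_]+)__p(\d{11})"
)

good = "paid__GEN_BI_BRAN_CARE_DITO_INS_GEN_EX_20186_WA_GEN__p13836758146"

# Hypothetical mistakes of the three kinds described in the post.
extra_underscore = "paid__GEN_BI_BRAN_CARE_DI_TO_INS_GEN_EX_20186_WA_GEN__p13836758146"
empty_field = "paid__GEN_BI__CARE_DITO_INS_GEN_EX_20186_WA_GEN__p13836758146"
missing_field = "paid__GEN_BI_BRAN_CARE_DITO_INS_GEN_EX_20186_WA__p13836758146"


def strict_groups(code):
    """Return the 12 captured values, or None if the whole code does not match."""
    m = STRICT.fullmatch(code)
    return m.groups() if m else None


print(strict_groups(good))              # tuple of 12 values, 'GEN' ... '13836758146'
print(strict_groups(extra_underscore))  # None
print(strict_groups(empty_field))       # None
print(strict_groups(missing_field))     # None
```

A single mistake anywhere in the code leaves every classification empty, not just the one for the affected segment.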

URL Query String Alternative

I want to change the tracking code so it's both mistake-resistant and regex-friendly. The idea is to make a "pseudo query string" format, like you'd normally get on a URL (e.g. ?name=ferret&color=purple). For example:

?cid=paid_1:GEN_2:BI_3:BRAN_4:CARE_5:DITO_6:INS_7:GEN_8:EX_9:20186_10:WA_11:GEN_12:13836758146

The regex in Adobe's Classification Rule Builder can then be set up as follows:

paid.*_1:([^_]+)(_)?.*

To match a different parameter from the string, all you need to do is change the number that follows the underscore (the 1 in _1:). In the rule builder, the match group to use will always be $1. E.g.:

paid.*_1:([^_]+)(_)?.*    > $1 matches GEN

paid.*_2:([^_]+)(_)?.*    > $1 matches BI

paid.*_3:([^_]+)(_)?.*    > $1 matches BRAN

paid.*_4:([^_]+)(_)?.*    > $1 matches CARE

[...]
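The per-parameter rules above can be sketched as one parameterised pattern. This is only an illustration in Python's re module (an assumption; the rule builder's engine may behave differently), and the helper names param_pattern and extract are hypothetical.

```python
import re


def param_pattern(n):
    """Build the rule for parameter n; only the number changes between rules."""
    return re.compile(rf"paid.*_{n}:([^_]+)(_)?.*")


def extract(code, n):
    """Return the value captured by group 1 ($1 in the rule builder), or None."""
    m = param_pattern(n).search(code)
    return m.group(1) if m else None


code = ("paid_1:GEN_2:BI_3:BRAN_4:CARE_5:DITO_6:INS_7:GEN"
        "_8:EX_9:20186_10:WA_11:GEN_12:13836758146")

for n in (1, 2, 3, 4, 10, 12):
    print(n, extract(code, n))
# 1 GEN
# 2 BI
# 3 BRAN
# 4 CARE
# 10 WA
# 12 13836758146
```

Because each key ends with a colon, _1: cannot be confused with _10:, _11: or _12:, so the single-digit rules do not pick up the two-digit parameters.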

If a user enables a tracking code which doesn't exactly match the new format, the other parameters will still classify just fine. E.g.:

?cid=paid_1:GEN_2:BI

...will still match GEN and BI.

?cid=paid_1:GEN_2:BI_3:BRAN_4:CARE_5:DITO_6:INS_7:GEN_8:EX_9:20186_10:WA_11:GEN_12:13836758146_13:blah_14:moreblah

...will still match everything set up in your classification rules and ignore 13 and 14.

?cid=paid_1:GEN_2:BI_3:BRAN_4:CARE_5:DI_TO_6:INS_7:GEN_8:EX_9:20186_10:WA_11:GEN_12:13836758146

...will only break classification of parameter 5 (which captures DI instead of DITO), not the entire key.
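The three cases above can be checked with a short sketch that applies the twelve per-parameter rules to each code. As before, Python's re module stands in for the rule builder's engine (an assumption), and the classify helper is hypothetical.

```python
import re


def classify(code, params=range(1, 13)):
    """Apply one rule per parameter; return {number: value} for those that match."""
    found = {}
    for n in params:
        m = re.search(rf"paid.*_{n}:([^_]+)(_)?.*", code)
        if m:
            found[n] = m.group(1)
    return found


full = ("paid_1:GEN_2:BI_3:BRAN_4:CARE_5:DITO_6:INS_7:GEN"
        "_8:EX_9:20186_10:WA_11:GEN_12:13836758146")
short = "paid_1:GEN_2:BI"
extra = full + "_13:blah_14:moreblah"
stray = full.replace("_5:DITO", "_5:DI_TO")  # extra underscore inside parameter 5

print(classify(short))                   # {1: 'GEN', 2: 'BI'}
print(classify(extra) == classify(full)) # True: 13 and 14 have no rule, so they are ignored
print(classify(stray)[5])                # DI

# Every parameter other than 5 is unaffected by the stray underscore.
unaffected = {n: v for n, v in classify(stray).items() if n != 5}
print(unaffected == {n: v for n, v in classify(full).items() if n != 5})  # True
```

Note that parameter 5 in the last case is not empty but truncated to DI, so a mistake of this kind yields a wrong classification value rather than an unclassified one.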

My question is: can anybody see any issues with this? I've searched far and wide and haven't found anyone using this approach.

Kind Regards,

Brandon

1 Accepted Solution

Correct answer by
Employee Advisor

Brandon, this looks like an amazing solution to your problem.

Eventually you may want to make sure all the data is coming through correctly, but this solution beautifully incorporates a great safety factor.

Hyder
