Hi All,
I've recently joined an organisation who are using reasonably complex campaign tracking codes, for example:
?cid=paid__GEN_BI_BRAN_CARE_DITO_INS_GEN_EX_20186_WA_GEN__p13836758146
This format contains 13 fields separated by 12 delimiters (single or double underscores), and the regex below captures 12 of them (everything except the leading "paid"). The issue I've found is that these codes are often full of mistakes: additional underscores added inside a field, fields containing no information, or fields missing entirely.
The regex in Adobe's Classification Rule Builder is as follows:
paid__([^_]+)_([^_]+)_([^_]+)_([^_]+)_([^_]+)_([^_]+)_([^_]+)_([^_]+)_([^_]+)_([^_]+)_([^_]+)__p(\d{11})
which I find difficult to read at first glance. Also, because it relies on an exact number of underscore-delimited fields, it will not match at all if a tracking code deviates from this format.
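To illustrate how brittle this is, here is a minimal sketch in Python. It uses Python's re module as a stand-in for Adobe's regex engine (which may behave differently) and assumes the rule has to match the whole key. The malformed codes and the helper name classify_all are made up for the test.

```python
import re

# The pattern quoted above, split over two lines for readability only.
ORIGINAL = re.compile(
    r"paid__([^_]+)_([^_]+)_([^_]+)_([^_]+)_([^_]+)_([^_]+)"
    r"_([^_]+)_([^_]+)_([^_]+)_([^_]+)_([^_]+)__p(\d{11})"
)

def classify_all(code):
    """Return all captured values, or None if the key does not match."""
    m = ORIGINAL.fullmatch(code)  # assumption: the whole key must match
    return m.groups() if m else None

good = "paid__GEN_BI_BRAN_CARE_DITO_INS_GEN_EX_20186_WA_GEN__p13836758146"
# Hypothetical mistakes of the kind described above:
extra_underscore = "paid__GEN_BI_BRAN_CARE_DI_TO_INS_GEN_EX_20186_WA_GEN__p13836758146"
empty_field = "paid__GEN__BRAN_CARE_DITO_INS_GEN_EX_20186_WA_GEN__p13836758146"
missing_field = "paid__GEN_BI_BRAN_CARE_INS_GEN_EX_20186_WA_GEN__p13836758146"

print(classify_all(good))              # 12 captured values
print(classify_all(extra_underscore))  # None
print(classify_all(empty_field))       # None
print(classify_all(missing_field))     # None
```

A single mistake in any one field means none of the fields get classified.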
URL Query String Alternative
I want to change the tracking code so it's both mistake-resistant and regex-friendly. The idea is to use a "pseudo query string" format, like you'd normally get on a URL (e.g. ?name=ferret&color=purple). For example:
?cid=paid_1:GEN_2:BI_3:BRAN_4:CARE_5:DITO_6:INS_7:GEN_8:EX_9:20186_10:WA_11:GEN_12:13836758146
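As a rough sketch of how such a code could be generated consistently, the Python below builds the string from an ordered list of values. The function name build_cid and the rule of rejecting empty values and underscores are my own assumptions, not part of any Adobe tooling.

```python
def build_cid(values):
    """Build a 'paid_1:A_2:B_...' tracking code from an ordered list of values."""
    parts = []
    for position, value in enumerate(values, start=1):
        value = str(value)
        # Assumption: reject values that would corrupt the format up front.
        if not value or "_" in value:
            raise ValueError(f"invalid value at position {position}: {value!r}")
        parts.append(f"{position}:{value}")
    return "paid_" + "_".join(parts)

values = ["GEN", "BI", "BRAN", "CARE", "DITO", "INS",
          "GEN", "EX", "20186", "WA", "GEN", "13836758146"]
print(build_cid(values))
```

Generating the code rather than typing it by hand would catch stray underscores before the code ever goes live.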
The regex in Adobe's Classification Rule Builder can then be set up as follows:
paid.*_1:([^_]+)(_)?.*
To match a different parameter from the string, all you need to do is change the number that follows the underscore (the 1 in _1:). In the rule builder the replace group will always be $1. E.g.:
paid.*_1:([^_]+)(_)?.* > $1 matches GEN
paid.*_2:([^_]+)(_)?.* > $1 matches BI
paid.*_3:([^_]+)(_)?.* > $1 matches BRAN
paid.*_4:([^_]+)(_)?.* > $1 matches CARE
[...]
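The rules above can be tried out with a small Python sketch. It uses Python's re module as a stand-in for Adobe's engine and assumes the rule must match the whole key; the function name classify is hypothetical.

```python
import re

def classify(code, n):
    """Return the value of parameter n (what $1 would hold), or None."""
    # Same pattern as in the rules above, with the number filled in.
    pattern = rf"paid.*_{n}:([^_]+)(_)?.*"
    m = re.fullmatch(pattern, code)  # assumption: the whole key must match
    return m.group(1) if m else None

code = ("paid_1:GEN_2:BI_3:BRAN_4:CARE_5:DITO_6:INS_7:GEN"
        "_8:EX_9:20186_10:WA_11:GEN_12:13836758146")

for n in range(1, 13):
    print(n, classify(code, n))
```

Note that _1: does not collide with _10:, _11: or _12:, because the colon has to follow the number directly. The trailing (_)?.* could probably be shortened to .* without changing what $1 captures.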
If a user enables a tracking code which doesn't exactly match the new format, the other parameters will still classify just fine. E.g.:
?cid=paid_1:GEN_2:BI
...will still match GEN and BI.
?cid=paid_1:GEN_2:BI_3:BRAN_4:CARE_5:DITO_6:INS_7:GEN_8:EX_9:20186_10:WA_11:GEN_12:13836758146_13:blah_14:moreblah
...will still match everything set up in your classification rules and ignore 13 and 14.
?cid=paid_1:GEN_2:BI_3:BRAN_4:CARE_5:DI_TO_6:INS_7:GEN_8:EX_9:20186_10:WA_11:GEN_12:13836758146
...will only break the classification of parameter 5, not the entire key.
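The three cases above can be checked with a short Python sketch, again using Python's re module as a stand-in for Adobe's engine and assuming whole-key matching. The function name classify and the extra empty-value case are my own additions.

```python
import re

def classify(code, n):
    """Return the value of parameter n (what $1 would hold), or None."""
    m = re.fullmatch(rf"paid.*_{n}:([^_]+)(_)?.*", code)
    return m.group(1) if m else None

short = "paid_1:GEN_2:BI"
extended = ("paid_1:GEN_2:BI_3:BRAN_4:CARE_5:DITO_6:INS_7:GEN_8:EX"
            "_9:20186_10:WA_11:GEN_12:13836758146_13:blah_14:moreblah")
stray_underscore = ("paid_1:GEN_2:BI_3:BRAN_4:CARE_5:DI_TO_6:INS_7:GEN"
                    "_8:EX_9:20186_10:WA_11:GEN_12:13836758146")
empty_value = "paid_1:_2:BI"  # hypothetical extra case: parameter 1 left blank

print(classify(short, 1), classify(short, 2), classify(short, 3))
print(classify(extended, 1), classify(extended, 12))
print(classify(stray_underscore, 4), classify(stray_underscore, 5),
      classify(stray_underscore, 6))
print(classify(empty_value, 1), classify(empty_value, 2))
```

One thing this shows: with the stray underscore, parameter 5 does not fail to match. It comes back truncated as DI, so the mistake is silent rather than unclassified. The neighbouring parameters are unaffected.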
My question is: can anybody see any issues with this? I've searched far and wide and haven't found anyone using this approach.
Kind Regards,
Brandon