Campaign Tracking Classification Rule Builder - How to ignore "extra" parameters

Hello,

 

I’m working on setting up the Classification Rule Builder for Campaign Tracking in Adobe Analytics and need some assistance with handling extra parameters in our tracking URLs.

 

In some cases, our tracking URLs have additional parameters appended by default (e.g., gclid in paid search ads). I want to ensure that our classification rules can handle up to five parameters correctly while ignoring any extra parameters that may appear beyond these five.

 

Objective:

  1. For URLs with exactly five parameters (e.g., source|medium|name|content|term): I want to match and capture all five parameters correctly.

  2. For URLs with more than five parameters (e.g., source|medium|name|content|term|extra1|extra2): I want to capture only the first five parameters and ignore the rest, ensuring that the classification remains accurate and does not break.

I am using the following regular expression pattern for objective number 1: ^(.*)\|(.*)\|(.*)\|(.*)\|(.*)$

How should I set it up to ignore additional parameters?

 

Thank you!

3 Replies

Community Advisor and Adobe Champion

Hi @giar1 ,

I know I have not always been the best with regex. However, there is an awesome post by one of my fellow Adobe Champions (@Andrew_Wathen_) that I believe may help you out.

https://experienceleaguecommunities.adobe.com/t5/adobe-analytics-blogs/supercharge-your-classificati... 

Best of luck!

Jeff Bloomer

Community Advisor

Hi @giar1 

I would assume something like this could be what you're looking for:

 

 

^(\w*)\|(\w*)\|(\w*)\|(\w*)\|(\w*)

 

 

bjoern__koth_0-1721748681666.png

 

bjoern__koth_0-1721748544573.png

bjoern__koth_1-1721748553238.png

You can try it here: https://regex101.com/r/PYSncQ/1 . Obviously, I would recommend testing with some more sophisticated sample data.

See the screenshot above, where the meaning of "\w" in a regex context is explained.

Should this character class not be enough, you can always extend the regex with additional supported characters, e.g.:

// to allow a colon/: character, you will need to wrap
// the \w in square brackets in which you can list any other character you need
...([\w:]*)...
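
As a rough sketch of how the two variants above behave, here is a quick check using Python's re module. Python is only a stand-in for trying the patterns out; the Classification Rule Builder's own regex engine may differ in details, and the sample keys (including the one with a colon) are made up for illustration.

```python
import re

# The \w-based pattern from this reply, anchored at the start only.
WORD = re.compile(r"^(\w*)\|(\w*)\|(\w*)\|(\w*)\|(\w*)")
# The same pattern with the class widened to also accept a colon.
WORD_COLON = re.compile(r"^([\w:]*)\|([\w:]*)\|([\w:]*)\|([\w:]*)\|([\w:]*)")

# Hypothetical sample keys, not taken from a real report suite.
seven = "source|medium|name|content|term|extra1|extra2"
with_colon = "google:ads|cpc|summer|banner|shoes|gclid123"

# Only the first five slots are captured; the extras are left unmatched.
print(WORD.match(seven).groups())

# \w does not cover ":", so the plain pattern finds no match at all here.
print(WORD.match(with_colon))

# The widened class captures the colon as part of the first slot.
print(WORD_COLON.match(with_colon).groups())
```

The point of the second call is that a value containing an unsupported character makes the whole rule miss, rather than just truncating that one slot.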

 

Cheers

 

Cheers from Switzerland!


Community Advisor and Adobe Champion

The first thing I would like to ask: are you sure that all 5 of your core parameters are always provided, and that there is no chance of "column slippage"?

 

By that I mean, you have "source|medium|name|content|term"

 

What happens if "content" is not provided? Do you have:

  • source|medium|name||term  (empty value between your pipes)
  • source|medium|name|na|term  (placeholder text such as na or none)
  • source|medium|name|term  (the value is missing completely, which makes your regex prone to failure. FYI, this scenario is very bad: if you are expecting 5 values and only get 4, your regex could fail, or, when "|extras" are appended, term might be captured as content and the first extra as your term.)
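
The three scenarios above can be tried out with a small sketch. It assumes Python's re module as a stand-in for the Rule Builder's engine and reuses the \w-based pattern posted earlier in this thread; the classify helper and the field names are hypothetical, chosen only for this illustration.

```python
import re

# \w-based five-slot pattern from the earlier reply (start anchor only).
PATTERN = re.compile(r"^(\w*)\|(\w*)\|(\w*)\|(\w*)\|(\w*)")
FIELDS = ("source", "medium", "name", "content", "term")


def classify(key):
    """Hypothetical helper: map the five captured slots to field names."""
    match = PATTERN.match(key)
    return dict(zip(FIELDS, match.groups())) if match else None


# Empty slot between the pipes: content is "", term is still correct.
print(classify("source|medium|name||term|extra1"))

# Placeholder value: content is "na", term is still correct.
print(classify("source|medium|name|na|term|extra1"))

# Slot missing entirely, extras present: term slips into content
# and the first extra is captured as term.
print(classify("source|medium|name|term|extra1"))

# Slot missing entirely, no extras: only three pipes, so no match at all.
print(classify("source|medium|name|term"))
```

Only the first two formats keep every value in its intended column, which is why an empty slot or a placeholder is the safer convention.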

 

As it stands, your current regex matches "any" character, including the pipe itself, so your extras are still coming in.

 

Assuming you are only expecting alphanumeric characters (letters A-Z in either case, or digits 0-9, nothing else), you can try using:

 

([a-zA-Z0-9]*)\|([a-zA-Z0-9]*)\|([a-zA-Z0-9]*)\|([a-zA-Z0-9]*)\|([a-zA-Z0-9]*)

 

Unlike .*, which also matches your pipes (|), this looks for specific characters only and grabs just the first 5 slots (note that I also removed the start-of-line and end-of-line anchors).

 

Testing this on https://www.regextester.com/:

Jennifer_Dungan_3-1721748460244.png

You can see how each group is pulled, with nothing crossing over the pipes, and the match only goes up to the term.
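
For anyone who wants to reproduce that result without the screenshot, here is a minimal sketch using Python's re module (an assumption for illustration only; the Rule Builder's engine may behave differently). Because the pattern has no anchors, search is used rather than match.

```python
import re

# The unanchored alphanumeric pattern from this reply.
ALNUM = re.compile(
    r"([a-zA-Z0-9]*)\|([a-zA-Z0-9]*)\|([a-zA-Z0-9]*)\|([a-zA-Z0-9]*)\|([a-zA-Z0-9]*)"
)

seven = "source|medium|name|content|term|extra1|extra2"

found = ALNUM.search(seven)

# Each group stops at its pipe; nothing crosses over.
print(found.groups())

# The overall match ends at the fifth slot, so the extras are ignored.
print(found.group(0))
```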

 

 

Your original rule, on the other hand, grabs everything until the end of the line, collapsing multiple items across the pipes into one group:

Jennifer_Dungan_2-1721748429326.png
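
The same collapsing can be seen in a short sketch, again assuming Python's re module purely as a test bench. The greedy first group takes as much as it can, so only the last four pipes end up acting as separators.

```python
import re

# Original pattern from the question: greedy ".*" groups, anchored at both ends.
ORIGINAL = re.compile(r"^(.*)\|(.*)\|(.*)\|(.*)\|(.*)$")

five = "source|medium|name|content|term"
seven = "source|medium|name|content|term|extra1|extra2"

# With exactly five values, every slot lands where expected.
print(ORIGINAL.match(five).groups())

# With two extras, the first group swallows "source|medium|name"
# and every later slot is shifted by two positions.
print(ORIGINAL.match(seven).groups())
```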

 

 

 

You may need to tweak the rules a bit to account for dashes or underscores. If you need the dash, either put it first in each group so that the regex doesn't read it as a range, or escape it. Both forms are shown here:

 

([-_a-zA-Z0-9]*)

([a-zA-Z0-9\-_]*)
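
A quick sketch of why the tweak matters, using Python's re module as an assumed stand-in and a made-up key containing dashes and underscores. The five_groups helper is hypothetical; it just repeats one character class across five pipe-separated capture groups.

```python
import re

DASH_FIRST = r"[-_a-zA-Z0-9]*"      # dash listed first, so it is a literal
DASH_ESCAPED = r"[a-zA-Z0-9\-_]*"   # dash escaped, so it is a literal
ALNUM_ONLY = r"[a-zA-Z0-9]*"        # no dash or underscore allowed


def five_groups(char_class):
    """Hypothetical helper: build the unanchored five-slot pattern."""
    return re.compile(r"\|".join(f"({char_class})" for _ in range(5)))


# Made-up tracking key for illustration.
key = "paid-search|cpc|summer_sale|top-banner|running-shoes|gclid123"

# Both spellings of the class capture the same five values.
print(five_groups(DASH_FIRST).search(key).groups())
print(five_groups(DASH_ESCAPED).search(key).groups())

# Without the dash and underscore, no run of five slots qualifies,
# so the rule does not match this key at all.
print(five_groups(ALNUM_ONLY).search(key))
```

Since the pattern is unanchored, it is worth testing keys like this one before relying on the rule: a class that is too narrow can miss the key entirely or, for other inputs, start matching partway through a value.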