Campaign Tracking Classification Rule Builder - How to ignore "extra" parameters | Community
July 23, 2024
Question

Campaign Tracking Classification Rule Builder - How to ignore "extra" parameters

  • July 23, 2024
  • 3 replies
  • 781 views

Hello,

 

I’m working on setting up the Classification Rule Builder for Campaign Tracking in Adobe Analytics and need some assistance with handling extra parameters in our tracking URLs.

 

In some cases, our tracking URLs have additional parameters appended by default (e.g., gclid in paid search ads). I want to ensure that our classification rules can handle up to five parameters correctly while ignoring any extra parameters that may appear beyond these five.

 

Objective:

  1. For URLs with exactly five parameters (e.g., source|medium|name|content|term): I want to match and capture all five parameters correctly.

  2. For URLs with more than five parameters (e.g., source|medium|name|content|term|extra1|extra2): I want to capture only the first five parameters and ignore the rest, ensuring that the classification remains accurate and does not break.

I am using the following regular expression pattern for objective number 1: ^(.*)\|(.*)\|(.*)\|(.*)\|(.*)$

How should I set it up to ignore additional parameters?

 

Thank you!


3 replies

jeff_bloomer
Community Advisor and Adobe Champion
July 23, 2024

Hi @r-b1 ,

I know I have not always been the best with regex. However, there is this awesome post by one of my fellow Adobe Champions (@andrew_wathen_) that I believe may help you out.

https://experienceleaguecommunities.adobe.com/t5/adobe-analytics-blogs/supercharge-your-classification-rule-builder-in-adobe-analytics/ba-p/448153 

Best of luck!

bjoern__koth
Community Advisor and Adobe Champion
July 23, 2024

Hi @r-b1 

I would assume something like this could be what you're looking for:

 

 

^(\w*)\|(\w*)\|(\w*)\|(\w*)\|(\w*)

 

 

 

You can try it here: https://regex101.com/r/PYSncQ/1 . I would recommend testing it with some more sophisticated sample data.

In a regex context, "\w" matches any word character: a letter, a digit, or an underscore.

Should this character class not be enough, you can always extend the regex with additional supported characters, e.g.:

// to allow a colon/: character, you will need to wrap
// the \w in square brackets in which you can list any other character you need
...([\w:]*)...
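As a rough illustration, the pattern above and its colon-friendly variant can be tried locally. This sketch uses Python's re module as a stand-in for the Rule Builder's regex engine (an assumption, since the two engines may differ in details), and the sample tracking codes are made up.

```python
import re

# Five pipe-delimited slots, anchored at the start only, so anything
# after the fifth slot is simply left unmatched.
WORD_SLOTS = re.compile(r"^(\w*)\|(\w*)\|(\w*)\|(\w*)\|(\w*)")

# Same idea, but each slot also accepts a colon.
WORD_OR_COLON_SLOTS = re.compile(
    r"^([\w:]*)\|([\w:]*)\|([\w:]*)\|([\w:]*)\|([\w:]*)"
)

# Hypothetical sample values, not taken from real tracking data.
with_extras = "google|cpc|summer_sale|banner|shoes|extra1|extra2"
with_colon = "google|cpc|promo:summer|banner|shoes|extra1"

first_five = WORD_SLOTS.match(with_extras).groups()
colon_five = WORD_OR_COLON_SLOTS.match(with_colon).groups()
# ":" is not a word character, so the plain \w pattern does not match here.
colon_without_class = WORD_SLOTS.match(with_colon)

print(first_five)
print(colon_five)
print(colon_without_class)
```

In this sketch the extras are dropped in both cases, while the plain \w pattern returns no match at all for the value containing a colon.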

 

Cheers

 

Cheers from Switzerland!
Jennifer_Dungan
Community Advisor and Adobe Champion
July 23, 2024

The first thing I would like to ask is whether you are sure that all 5 of your core parameters are always provided, and that there is no chance of "column slippage"?

 

By that I mean, you have "source|medium|name|content|term"

 

What happens if "content" is not provided, do you have:

  • source|medium|name||term  (empty value between your pipes)
  • source|medium|name|na|term (placeholder text of na or none, etc)
  • source|medium|name|term  (or is it missing completely, making your potential regex subject to failures? FYI, this scenario is very bad: if you are expecting 5 and only get 4, your regex could fail, or, with the potential for "|extras", term might fall in as content and the first extra as your term.)
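To make the three scenarios above concrete, here is a small sketch that runs the pattern from the original question against each of them. Python's re module is used as a stand-in for the Rule Builder's engine (an assumption), and classify is a hypothetical helper name.

```python
import re

# The pattern from the original question: exactly five slots, any characters.
FIVE_ANY = re.compile(r"^(.*)\|(.*)\|(.*)\|(.*)\|(.*)$")


def classify(code):
    """Return the five captured slots, or None when the pattern does not match."""
    m = FIVE_ANY.match(code)
    return m.groups() if m else None


empty_slot = classify("source|medium|name||term")        # content present but empty
placeholder = classify("source|medium|name|na|term")     # content is a placeholder
missing_slot = classify("source|medium|name|term")       # content missing entirely
missing_with_extra = classify("source|medium|name|term|extra1")

print(empty_slot)
print(placeholder)
print(missing_slot)
print(missing_with_extra)
```

The first two keep every value in its own slot. The third does not match at all, and the fourth matches but with "term" in the content slot and "extra1" in the term slot, which is the slippage described above.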

 

As it stands, your current regex is looking for "any" character, pipes included, so your extras are still coming in.

 

Assuming you are only expecting alphanumeric characters (a-z, A-Z, or 0-9, nothing else), you can try using:

 

([a-zA-Z0-9]*)\|([a-zA-Z0-9]*)\|([a-zA-Z0-9]*)\|([a-zA-Z0-9]*)\|([a-zA-Z0-9]*)

 

Instead of the .* which will also match your pipes (|), this will look for specific characters, and only grab up to the first 5 slots (note that I also removed the start and end of line designations).

 

Testing this on https://www.regextester.com/

You can see how each group is pulled, nothing crossing over across the pipes.... and it's only getting up to the term.
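The same check can be reproduced outside the online tester. This is only a sketch: it assumes Python's re module behaves like the Rule Builder's engine for this pattern, and it uses the seven-part example from the question as input.

```python
import re

# The alphanumeric-only pattern, without start or end anchors.
ALNUM_SLOTS = re.compile(
    r"([a-zA-Z0-9]*)\|([a-zA-Z0-9]*)\|([a-zA-Z0-9]*)\|([a-zA-Z0-9]*)\|([a-zA-Z0-9]*)"
)

sample = "source|medium|name|content|term|extra1|extra2"

match = ALNUM_SLOTS.search(sample)
groups = match.groups()        # one value per slot
matched_text = match.group(0)  # the part of the string the pattern consumed

print(groups)
print(matched_text)
```

Each group stops at its pipe, and the overall match ends after "term", leaving the two extras untouched.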

 

 

Your original rule, on the other hand, grabs everything until the end, collapsing multiple items across the pipe:
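A minimal sketch of that behaviour, again assuming Python's re module as a stand-in for the Rule Builder's engine:

```python
import re

# The original rule: five greedy "anything" groups between start and end.
ORIGINAL = re.compile(r"^(.*)\|(.*)\|(.*)\|(.*)\|(.*)$")

sample = "source|medium|name|content|term|extra1|extra2"

groups = ORIGINAL.match(sample).groups()
print(groups)
# ('source|medium|name', 'content', 'term', 'extra1', 'extra2')
```

Because .* is greedy and also matches pipes, the first group swallows the surplus fields, so the remaining four groups end up holding the last four values rather than slots two to five.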

 

 

 

You may need to tweak the rules a bit to account for dashes or underscores. Note that the dash should be placed first in each group (if needed) so that the regex doesn't treat it as the start of a range, or you can escape it:

 

([-_a-zA-Z0-9]*)

([a-zA-Z0-9\-_]*)
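To show why this tweak can matter, the sketch below builds the five-slot pattern from each group variant and runs it on a made-up value containing a dash. Python's re module stands in for the Rule Builder's engine (an assumption), and five_slots is a hypothetical helper.

```python
import re

PLAIN = r"([a-zA-Z0-9]*)"
DASH_FIRST = r"([-_a-zA-Z0-9]*)"
DASH_ESCAPED = r"([a-zA-Z0-9\-_]*)"


def five_slots(group):
    """Join five copies of one group pattern with escaped pipes."""
    return re.compile(r"\|".join([group] * 5))


# Hypothetical tracking code with a dash in the third slot.
sample = "google|cpc|summer-sale|banner|shoes|extra1|extra2"

plain = five_slots(PLAIN).search(sample).groups()
dash_first = five_slots(DASH_FIRST).search(sample).groups()
dash_escaped = five_slots(DASH_ESCAPED).search(sample).groups()

print(plain)
print(dash_first)
print(dash_escaped)
```

With the plain group, the unanchored pattern skips ahead and starts matching at "sale", so every slot is shifted and the extras are captured. Both dash-aware variants return the intended first five values.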