SOLVED

Handling Page URL Cardinality in Workspace at Scale


Level 2

Has anyone else struggled with the Page URL dimension hitting cardinality limits in Workspace? On heavier traffic months, I’m seeing 10–15% of URLs rolled into Low Traffic, which makes the stakeholder dashboards pretty unreliable.

I know Adobe generally recommends leaning on Page Name or classifications, but that doesn’t always cut it when the request is for an exact URL list. We’ve tried a few things:

  • Classifying off a clean URL key (helpful but introduces latency + governance overhead).

  • Segmenting by page categories (good directional view, but not enough detail for stakeholder asks).

  • Falling back to Data Warehouse pulls (accurate but not self-service friendly).

Curious how other enterprise teams are dealing with this:

  1. Any practical ways you’ve kept URL-level visibility in Workspace without blowing up cardinality?

  2. Are there approaches you’ve found that give business users “exact URL” access without sending them to Data Warehouse every time?

  3. And has anyone leaned on CJA as a workaround here — does it actually solve the long-tail URL reporting pain?

Would love to hear what’s working (or not) for others. 

1 Accepted Solution


Correct answer by
Community Advisor

Hi @jarrell43 

Sounds like you are capturing URLs including all query and hash values, which in certain cases leads to Low Traffic if these are calculated on a visitor or session basis. The reporting benefit of that is questionable, I'd say.

Depending on your URL structure, I would recommend that you:

  • only capture the domain and pathname, stripping all query params and hashes
  • if there is a dynamic part in the pathname, like /path/to/<user id>/pageOrDocument, replace it with something generic like "user id" or "document id"

The simplest way to get a clean URL would be a custom code data element in your tag manager that returns the following:

 

return document.URL.split("?")[0].split("#")[0];

 

This should give you a high-level consolidation of all your unique URLs.
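To extend that, here is a rough sketch of a data element that also generalizes dynamic path segments into placeholders, per the second bullet above. The regex patterns and placeholder names are just examples, not a definitive list; adjust them to your actual URL structure:

```javascript
// Custom code data element sketch: strip query string and hash,
// then replace dynamic path segments with generic placeholders.
// The patterns below (numeric IDs, UUIDs) are illustrative only.
function cleanUrl(url) {
  var base = url.split("?")[0].split("#")[0];
  return base
    // e.g. /path/to/12345/doc -> /path/to/{id}/doc
    .replace(/\/\d+(?=\/|$)/g, "/{id}")
    // e.g. UUID path segments -> /{uuid}
    .replace(/\/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}(?=\/|$)/gi, "/{uuid}");
}

// Inside the data element you would simply end with:
// return cleanUrl(document.URL);
```

That keeps one row per page template rather than one row per user or document, which is usually what stakeholders actually want to compare anyway.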

 

@Jennifer_Dungan AFAIK, CJA does not have this limitation anymore.


Cheers from Switzerland!


5 Replies


Community Advisor and Adobe Champion

As far as I am aware, CJA would have the same sort of "low traffic" buckets.

 

I am curious: how many URLs are you dealing with, and what sort of reports would require seeing every single URL directly out of Adobe?

 

Even Adobe Workspace will only show up to 400 values at a time, and the export from Workspace only supports up to 50,000 values at a time. Unless you apply a segment like "Section A" and then show all URLs within it, so the list is a manageable amount to view in a report (though still limited by Low Traffic), you don't have many options. You can also use Data Warehouse pulls, but those have the other challenge of not being able to de-duplicate metrics like UVs or Visits.

 

Honestly, we do very little reporting using URLs. When we do use them, it's for digging into potential issues or doing spot checks, so we stick with the URLs we do have; and if the issue is all "Low Traffic", we use our data lake to help augment the investigation.

 

Essentially, this has rarely been a problem for us... but I would like to know more about your needs to see if I can make any recommendations.


Level 2

Thanks for the perspective, as always. We’re usually dealing with a few million URLs in a month. On heavier traffic months, about 10–15% fall into Low Traffic. The main pain is stakeholder dashboards where people want the full URL list instead of just page names or roll-ups.

We do use Data Warehouse for deep dives, but like you said, that creates issues with UVs/Visits and isn’t something most business users will pull for themselves. Workspace is still where most of the day-to-day happens, which is why the limit is tough.


Community Advisor and Adobe Champion

Yeah, the limits are always going to be a bit of a challenge...

 

We track both the full URL (with query params) and the canonical URL. I was also considering tracking the URL without params, but haven't had a big need for that yet. Canonical almost does the trick, but since we have content shared across different sites, the canonical points only to the main owner of the content (though I can still break it down by which server loaded the content).




Level 2

Good call on stripping query strings and hashes; that does help reduce noise! Where we run into trouble is when those params hold the key info (article IDs, form IDs, campaign tags). Stakeholders still ask for URL-level reporting that shows those identifiers.

I like the idea of replacing dynamic path elements with placeholders. That could help us find a middle ground between clean reporting and the detail that matters.

Also exploring if web dev can expose those identifiers somewhere without creating the cardinality problem...
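To make that last idea concrete, here's a rough sketch of what I mean: a small data element per parameter, so each identifier lands in its own dimension (eVar) while the Page URL dimension stays stripped of query strings. The parameter names ("articleId", "formId") are just placeholders from my earlier example, not our real names:

```javascript
// Sketch: extract one reporting-relevant query parameter from a URL
// so it can feed a dedicated data element / eVar, keeping the
// Page URL dimension free of query strings.
function getParam(url, name) {
  var query = url.split("#")[0].split("?")[1] || "";
  var pairs = query.split("&");
  for (var i = 0; i < pairs.length; i++) {
    var kv = pairs[i].split("=");
    if (decodeURIComponent(kv[0]) === name) {
      return decodeURIComponent(kv[1] || "");
    }
  }
  return "";
}

// One data element per identifier, e.g.:
// return getParam(document.URL, "articleId");
```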

Thank you both!