Duplicate pageview removal via normalization segment

Level 1

Hi team,

 

I’ve learned from my colleagues that one of the months in my historical Adobe Analytics data contains duplicate pageviews due to an incorrect s.t() implementation.

The issue was fixed a long time ago, but I’m wondering if there’s a way to deduplicate these pageviews for a specific time period using a segmentation approach.


I was considering a general rule like:
time.period is = "date" AND include hits where the page exists and does not equal the previous page
but I’m not sure how to implement this so that it applies to all pages (not just a specific one).


Any thoughts or recommendations?

Thank you!

7 Replies

Community Advisor

Hi @kayzee 

Good question. Deduplicating pageviews from historical data isn’t always straightforward, but you’re on the right track.

Since Adobe Analytics doesn’t natively support row-level de-duplication like SQL would, your best bet is to create a hit-level segment that filters out consecutive duplicate page views. And yes, this can be done without listing each page manually.

Here’s a general approach:

  1. Segment Type: Hit-level

  2. Logic:

    • Include hits where Page exists

    • AND Page does not equal Previous Page

This works because Previous Page is a dimension that lets you compare each hit to the last. So for users who hit the same page multiple times in a row (due to faulty s.t() calls), this will exclude all but the first instance.

To restrict it to the specific time period where the bug occurred, wrap it with:

  • Visit Start Date is within your known issue month (or use a custom date range in your workspace)

One thing to keep in mind is that this approach will remove legitimate back-to-back views of the same page as well. So depending on how critical precision is, you may want to isolate this to just the known affected dates and do some validation before using it widely.

Hope this works!

Level 1

Hi Vinay,

 

How exactly do you build this piece of logic in the Segment Builder UI?

  • AND Page does not equal Previous Page

 

Thank you

Community Advisor

Hi @kayzee 

Unfortunately, Adobe’s Segment Builder doesn’t allow a direct comparison between “Page” and “Previous Page” in a conditional statement (like “Page ≠ Previous Page”). It’s not like SQL, where you can do row-to-row comparisons dynamically.

However, here’s a workaround that approximates the same logic:

Option 1: Build a Segment That Excludes Obvious Duplicates

If your duplicate issue was caused by the same page firing twice in a row, try this:

  1. In Segment Builder, set to Hit-level.

  2. Drag in the Page dimension and use:

    • “exists” (or "is not empty")

  3. Then add a Sequential Condition to exclude hits where the same page is repeated:

    • Step 1: Page = any (you can leave this generic)

    • THEN immediately followed by: Page = same page again

This isn’t perfect but can help identify or isolate sessions where duplicate hits are likely happening. You can then refine further by adding a date constraint (like “Visit Start Date is in March 2024”) to target only the impacted time frame.

Option 2: Export and Clean Up in Excel or SQL (if precision matters)

Since Segment Builder isn’t built for dynamic field comparisons, your cleanest method for true deduplication is to export the raw data (Data Feed or Data Warehouse) and filter programmatically using Page ≠ Previous Page logic in Excel, SQL, or Python.
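
If you go the export route, the Page ≠ Previous Page filter could look roughly like the sketch below. It is written in JavaScript, but the same idea carries over to SQL or Python. The row shape (visitId, hitTime, page) and the function name are made up for illustration; real Data Feed or Data Warehouse columns are named differently and would need to be mapped to these fields first.

```javascript
// Hypothetical row shape: { visitId, hitTime, page }.
// Map your exported columns to these fields before using the function.
function dropConsecutiveDuplicates(hits) {
  // Sort by visit, then by time, so "previous" means the prior hit in the same visit.
  const ordered = [...hits].sort(
    (a, b) => a.visitId.localeCompare(b.visitId) || a.hitTime - b.hitTime
  );
  const kept = [];
  let prev = null;
  for (const hit of ordered) {
    const sameVisit = prev !== null && prev.visitId === hit.visitId;
    // Keep the hit unless it repeats the page of the hit right before it in this visit.
    if (!(sameVisit && prev.page === hit.page)) kept.push(hit);
    prev = hit;
  }
  return kept;
}

const sample = [
  { visitId: "v1", hitTime: 1, page: "home" },
  { visitId: "v1", hitTime: 2, page: "home" },    // duplicate s.t() call, dropped
  { visitId: "v1", hitTime: 3, page: "pricing" },
  { visitId: "v1", hitTime: 4, page: "home" },    // not consecutive, kept
  { visitId: "v2", hitTime: 5, page: "home" },    // different visit, kept
];
const cleaned = dropConsecutiveDuplicates(sample);
console.log(cleaned.map((h) => `${h.visitId}:${h.page}`));
```

Like the segment idea, this also drops genuine back-to-back views of the same page, so it is best applied only to the affected date range.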

Let me know if it works.

Level 1

Hi @Vinay_Chauhan 

 

Could you please show a screenshot of exactly how to build step 3 in your Option 1?

 

Thank you  

Level 3

Hi @kayzee ,

 

It is quite challenging to handle such scenarios directly within Workspace.

Recently, I encountered a similar issue in my application—though in my case, pageviews were being inflated due to a different bug, not just duplicate calls.

Normalising duplicate page view calls can be achieved in Workspace using workarounds and assumptions. However, for this to be reliable, we need to be 100% certain that the duplicate pageviews occurred consistently for all users within a known date range. Only then is it reasonably safe to divide the pageviews for the affected page(s) by 2.

 

In essence, the approach I used was similar to this SQL-style logic:

ROW_SUM(ADJ_PAGEVIEW_1, ADJ_PAGEVIEW_2)

 

 

  • ADJ_PAGEVIEW_1: Pageviews divided by 2 to account for the duplication on the specific page(s).

  • ADJ_PAGEVIEW_2: Actual pageview count, excluding the affected page(s) during the impacted time period.

By combining segments and calculated metrics creatively, I was able to approximate a more accurate trend.

Example:

I tested the approach by adjusting the pageviews with a factor of 0.5 (i.e., halving them), and it seems to be working as expected in my case.

This workaround helps in analyzing trends and reporting only when the duplicate behavior is uniform and predictable. If the duplication varies—say, some users get 2 duplicate calls while others get 3 or more—then this approach will not yield accurate numbers.
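
To make the arithmetic concrete, here is a small sketch of the same sum outside Workspace. The rows, the `affected` flag (meaning an affected page inside the impacted date range), and the function name are all invented for illustration; in Workspace the equivalent is built from segments and a calculated metric.

```javascript
// Hypothetical rows; "affected" marks pages hit by the double s.t() call
// during the impacted time period.
function adjustedPageviews(rows, factor = 0.5) {
  let adj1 = 0; // ADJ_PAGEVIEW_1: affected pages, scaled down by the factor
  let adj2 = 0; // ADJ_PAGEVIEW_2: everything else, left untouched
  for (const row of rows) {
    if (row.affected) adj1 += row.pageviews * factor;
    else adj2 += row.pageviews;
  }
  return adj1 + adj2; // the ROW_SUM of the two adjusted metrics
}

const rows = [
  { page: "home", pageviews: 2000, affected: true },
  { page: "pricing", pageviews: 300, affected: false },
];
console.log(adjustedPageviews(rows)); // 1300
```

The fixed factor is only valid when every affected pageview was duplicated the same number of times, as noted above.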

[Screenshots: four images from the original post]

 

I discovered the ROW_SUM function from the recently published Calculated Metric Playbook by @MandyGeorge.

 

Thanks,

Nitesh

 

Community Advisor

Hi @kayzee ,

 

Adobe Analytics doesn’t offer a built-in way to directly compare the current page to the previous page across all hits unless you explicitly captured the previous page using getPreviousValue() in your implementation.

 

Implementation Options
Option A: Best Practice (with getPreviousValue)
Precondition: You have implemented getPreviousValue() and stored the previous page in an eVar or prop (e.g., eVar15 = previousPage).
 
Then your segment can be built like this:
In Segment Builder:
  • Container: Hit
  • Logic:
    • Page exists
    • Page does not equal eVar15 (your "previous page" variable)

Limit to Time Period:

Instead of putting "time.period = date" inside the segment, do this:

  • When you apply the segment in Workspace, filter the panel to your target date range (e.g., Jan 1–31, 2024).

  • Don’t hard-code date into the segment, so it's reusable.

 

Option B: Without Previous Page = Not Feasible for All Pages

If you don’t have a previous page variable captured, Adobe does not let you compare the current and previous hit directly (unlike SQL or CJA).

You might try a Sequential Segment, but it becomes page-specific. Example:

 

Hit 1: Page = “X”
THEN within same visit
Hit 2: Page = “X” → this is duplicate

 

So you could build a Sequential Segment like:

  • Step 1: Page equals X
  • THEN within 1 hit
  • Step 2: Page equals X again

But this would have to be built for every page separately — not scalable.

 

Recommendations: 

 

If you already have a previous page variable (e.g., eVar15):

Use this segment logic:

 

Container: HIT
AND
Page exists
AND
Page does not equal eVar15

 

 

Then apply the time period in the Workspace panel when pulling data.

 

If Not Implemented Yet: Quick Fix for Future

Ask your dev team to implement getPreviousValue():

 

s.eVar15 = getPreviousValue(s.pageName, 'gpv_pn');

 

It stores the current value in a cookie (gpv_pn in this example) and returns the value that was stored by the previous call, so on each page view you get the previous page name.
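
For intuition only, the mechanism can be sketched as below. This is a simplified stand-in, not Adobe's plugin code: a Map plays the role of the browser cookie, and `previousValueSketch` is a hypothetical name.

```javascript
// Simplified stand-in for the plugin's behavior; NOT Adobe's implementation.
// A Map plays the role of the browser cookie jar.
const cookieJar = new Map();

function previousValueSketch(currentValue, cookieName) {
  const previous = cookieJar.get(cookieName) ?? ""; // value stored by the last call
  cookieJar.set(cookieName, currentValue);          // remember the current value for next time
  return previous;
}

// Simulate three page views, the second one being a duplicate call.
const seen = ["home", "home", "pricing"].map((page) => ({
  page,
  previousPage: previousValueSketch(page, "gpv_pn"),
}));

// A hit whose page equals its previousPage is a candidate duplicate.
const duplicates = seen.filter((hit) => hit.page === hit.previousPage);
console.log(duplicates.length); // 1
```

Note that this only helps for data collected after the variable is implemented; it cannot add a previous page value to historical hits.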

 

[Screenshot of segment]

 

Thanks.

Pradnya

Community Advisor and Adobe Champion

Another option would be to use a segment that looks at the specific date range (and potentially only the platform where the duplication exists) and excludes hits that also have the "Reloads" metric.

 

Basically, Adobe identifies "reloads" by checking whether some of the key dimensions are identical (like page name, URL, and a few other fields).

 

In the scenario you are describing, the second s.t() call should be treated like a reload. This does mean you will also be throwing out real reloads. If you really want to get creative, you can figure out your average reload rate and then "add back the average" for the date range.

 

This should get you a pretty realistic right-sizing.
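
One way to read "add back the average" is sketched below, with invented numbers. It assumes the reload rate is defined as reloads divided by page views and is measured in a clean period; the function name and figures are hypothetical.

```javascript
// All figures are invented for illustration.
function rightSizedPageviews(pageviews, reloads, baselineReloadRate) {
  // Hits left after excluding everything flagged as a reload
  // (the duplicate s.t() calls plus the real reloads).
  const nonReloadViews = pageviews - reloads;
  // If real reloads normally make up a share r of all page views, the remaining
  // hits are the other (1 - r) share, so dividing adds the expected real reloads back.
  return Math.round(nonReloadViews / (1 - baselineReloadRate));
}

const baselineReloadRate = 400 / 10000; // reloads / page views in a clean month (hypothetical)
const estimate = rightSizedPageviews(10000, 5200, baselineReloadRate);
console.log(estimate); // 5000
```

This is an estimate for trend reporting, not a hit-level correction, and it is only as good as the assumption that the reload rate in the affected month matched the clean baseline.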