How to calculate reload using data feed

Level 1

I refer to the definition in the documentation: 

- The ‘Reloads’ metric shows the number of times a dimension item was present during a reload. A visitor refreshing their browser is the most common way to trigger a reload.

- This metric counts the number of hits where the Page dimension contains the same value as the previous hit.

 

So I use a window function to calculate it:
LEAD(post_pagename, 1) OVER(PARTITION BY visid ORDER BY visit_page_num) as next_pagename

 

Then, if next_pagename = post_pagename, I mark the record as a reload. But the number I get is about 5 times larger than what the Reloads metric shows on the dashboard. Why?

2 Replies

Community Advisor and Adobe Champion

Silly question, but I assume your calculation takes the visitor identification into account and looks only at the "page view rows" (and not actions performed on the same page; I think post_pagename should cover this)? And I assume you are also using exclude_hit to make sure you are dealing only with data that should be included?

 

The current documentation only mentions the Page dimension containing the same value, but I am sure that a few years ago it mentioned a combination of dimensions. That particular document no longer seems to exist, but I recall there being some extra checks to weed out inflated counts when the implementation passes the same name. I am fairly sure the pageURL was one of those other fields, so that both the name and the URL had to repeat from the previous hit.

 

You could give that a try and see if anything improves?

 

One potential issue with comparing just the page name is pages like search results, where a page number parameter changes as the user moves through the results but the implementation might not change the page name. However, comparing URLs would only help if the pageURL contains query parameters (and doesn't truncate them).
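To make the difference concrete, here is a minimal Python sketch of the two comparisons on made-up hits for a single visit. The column names mirror the data feed, but the rows, the URLs, and the helper `count_repeats` are hypothetical, and this is only an illustration of the idea, not Adobe's documented logic for the Reloads metric.

```python
# Toy hits for one visit, already filtered to page views and sorted in hit order.
# All values are invented for illustration.
hits = [
    {"post_pagename": "search results", "post_page_url": "https://example.com/search?q=shoes&page=1"},
    {"post_pagename": "search results", "post_page_url": "https://example.com/search?q=shoes&page=2"},
    {"post_pagename": "search results", "post_page_url": "https://example.com/search?q=shoes&page=2"},
    {"post_pagename": "product", "post_page_url": "https://example.com/p/123"},
]


def count_repeats(rows, columns):
    """Count rows whose values in `columns` all equal those of the previous row."""
    repeats = 0
    for prev, cur in zip(rows, rows[1:]):
        if all(cur[c] == prev[c] for c in columns):
            repeats += 1
    return repeats


# Comparing the page name alone also counts the move from page 1 to page 2.
name_only = count_repeats(hits, ["post_pagename"])

# Requiring the URL to repeat as well counts only the true refresh of page 2.
name_and_url = count_repeats(hits, ["post_pagename", "post_page_url"])

print(name_only, name_and_url)  # prints "2 1"
```

If the URL column is truncated or carries no query string, both counts come out the same, which is the caveat mentioned above.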

 

 

Our data engineering team hasn't tried to replicate Reloads, so I can't check with them. They focus more on things we can't do within Adobe (like stitching our analytics data to other sources).

 

Good luck!

Level 1

Hi Jennifer,

 

Thank you for replying! 

 

1. Yes. I use post_visid_high || post_visid_low || visit_num || visit_start_time_gmt as visid to represent each visit session. I also use exclude_hit = 0 and hit_source NOT IN the four sources. My sessions, page views, time spent, and bounces all match, so I think I am using the correct approach. 

 

2. I think the current post_pagename already includes post_page_url, so I didn't compare post_page_url with the next hit. I am referring to the documentation below: 

Column | Description | Data type
pagename | The Page dimension. If the pagename variable is empty, Analytics uses page_url instead. | varchar(100)
pagename_no_url | Similar to pagename, except it does not fall back to page_url. Only the post column is available. | varchar(100)

 

3. An update: I now use page_name != '' to ensure that I only capture page views. This significantly reduces the reload count. So my current logic becomes: 

 

- CTE1: LEAD(post_pagename, 1) OVER(PARTITION BY visid ORDER BY visit_page_num) as next_pagename

- CTE2: If post_pagename = next_pagename and post_pagename != '', then mark the record as a reload. 

However, it still doesn't match. We see a difference of 1000+ each day for some reason. Do you have any other recommendations?
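For reference, the two CTE steps above can be reproduced on toy data with Python's built-in sqlite3 module. This is only a sketch: the table name `hits` and the rows are made up, visid stands for the concatenated key from point 1, and window functions need SQLite 3.25 or newer. It shows why the empty-name filter lowers the count, not why the remaining gap exists.

```python
import sqlite3

# Made-up hits: (visid, visit_page_num, post_pagename).
# An empty post_pagename stands in for a non-page-view hit.
rows = [
    ("A", 1, "home"),
    ("A", 2, "home"),   # same name as the hit before it
    ("A", 3, ""),
    ("A", 4, ""),       # two empty names in a row also compare as equal
    ("A", 5, "cart"),
    ("B", 1, "home"),
    ("B", 2, "cart"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hits (visid TEXT, visit_page_num INTEGER, post_pagename TEXT)"
)
conn.executemany("INSERT INTO hits VALUES (?, ?, ?)", rows)

QUERY = """
WITH cte1 AS (
    SELECT visid,
           visit_page_num,
           post_pagename,
           LEAD(post_pagename, 1) OVER (
               PARTITION BY visid ORDER BY visit_page_num
           ) AS next_pagename
    FROM hits
)
SELECT
    -- name comparison only
    SUM(CASE WHEN post_pagename = next_pagename THEN 1 ELSE 0 END),
    -- name comparison plus the non-empty filter from CTE2
    SUM(CASE WHEN post_pagename = next_pagename
              AND post_pagename != '' THEN 1 ELSE 0 END)
FROM cte1
"""

unfiltered, filtered = conn.execute(QUERY).fetchone()
print(unfiltered, filtered)  # prints "2 1"
conn.close()
```

The last hit of each visit has a NULL next_pagename, so the comparison is never true there and the row is not counted.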