Counting Entries from Raw Data | Community
Skip to main content
April 14, 2021
Solved

Counting Entries from Raw Data

  • April 14, 2021
  • 1 reply
  • 1764 views

I am trying to recreate the Previous Page report using raw data. My calculations are coming back less than what is shown in Reports. For example, I am defining "Entered Site" as where the post_pagename is the target page, and visit_page_num = 1. This leaves me 8 fewer than expected over my 5 day period. I am capturing all of the Page Views, but less are being assigned "Entered Site" than what I see in Reports. My best guess is I am missing site entries that happen in the middle of a visit, such as someone exiting the website, then directly navigating back to the target page before the session expires.

 

I am looking for how site entries and previous pages are defined by the Analytics tool so I can recreate it with HiveQL. Are there other columns used to calculated Entries and previous pages? Thank you.

 

Example query:

 

with renamed_pagenames as ( select case when (post_pagename like "oao: your info: welcome" and visit_page_num = 1) then "Entered Site" else post_pagename end as my_pagename from mktpi2.adobe_stg_un_hit_data where load_date between "2021-01-01" and "2021-01-05" and exclude_hit like "0" and hit_source not in (5,8,9) and post_pagename = "oao: your info: welcome") select my_pagename, count(my_pagename) from renamed_pagenames group by my_pagename limit 100;

 

 

This post is no longer active and is closed to new replies. Need help? Start a new post to ask your question.
Best answer by David_Jerome

@dillangump- I think this might be something really simple.

and hit_source not in (5,8,9)

should be

and hit_source not in (5,7,8,9)

Once I did that (and renamed the page and database) your query ran fine for me.

1 reply

David_Jerome
David_JeromeAccepted solution
April 15, 2021

@dillangump- I think this might be something really simple.

and hit_source not in (5,8,9)

should be

and hit_source not in (5,7,8,9)

Once I did that (and renamed the page and database) your query ran fine for me.

David_Jerome
April 16, 2021
Curious - what sort of difference are you seeing as a percentage across 5 days?