I am trying to query my data feed to exclude entry pages, but am coming up with a discrepancy and would appreciate any advice.
The result of the code below is 125229. This matches the hit depth dimension shown below.
select
count(distinct(
case when visit_page_num = 1 and post_evar4 like '%account.box%' then concat(post_visid_high,post_visid_low) end))
FROM `adobe.adobe_data`
where date_trunc(date_time,month) = '2025-02-01' and exclude_hit = 0
However, when I change the query to visit_page_num <> 1 (exclude entry), I get a value of 195904, which is very different from 105,959.
select
count(distinct(
case when visit_page_num <> 1 and post_evar4 like '%account.box%' then concat(post_visid_high,post_visid_low) end))
FROM `adobe.adobe_data`
where date_trunc(date_time,month) = '2025-02-01' and exclude_hit = 0
Account.box Segment = Contains 'account.box' (hit)
Entry Page <> account.box Segment = Exclude hit where entry page contains account.box
Can anyone explain why these numbers are different and if I should be considering another query?
Solved! Go to Solution.
Topics help categorize Community content and increase your ability to discover relevant content.
Views
Replies
Total Likes
Sorry for all of the replies. I found a solution by not using visit_page_num and instead replacing the logic with visit_start_pagename. This gave me a 0.3% difference:
select
count(distinct(
case when post_pagename like '%account.box%' and visit_start_pagename not like '%account.box%' then concat(post_visid_high,post_visid_low) end))
FROM `adobe.adobe_data`
where date_trunc(date_time,month) = '2025-02-01' and exclude_hit = 0
Adobe Segments: 105,959
SQL Output: 105,693
Variance: 0.3%
Views
Replies
Total Likes
Hi @czmudzin ,
The logic used in SQL and AA segment for second comparison is not the same. In SQL you are excluding all the entry pages by using "visit_page_num <> 1" clause while in AA segment you are still using Entry Page in the logic "Exclude hit where entry page contains account.box".
Another thing, in the SQL for second comparison you haven't added the clause for checking the entry page value like you have in the segment.
Try below query to replicate the logic you have in AA reports,
select
count(distinct(
case when visit_page_num = 1 and post_evar4 like '%account.box%' and post_pagename not like '%account.box%' then concat(post_visid_high,post_visid_low) end))
FROM `adobe.adobe_data`
where date_trunc(date_time,month) = '2025-02-01' and exclude_hit = 0
Cheers!
Views
Replies
Total Likes
@Harveer_SinghGi1 Thanks for your reply.
You stated that I am excluding all entry pages, but my SQL logic has an and operator:
case when visit_page_num <> 1 and post_evar4 like '%account.box%'
(Note that post_evar4 is the same as post_pagename, but more reliable in our instance)
This logic should exclude only hits where the pagename contains account.box and where it's the first hit of the visit.
I guess I don't understand why entry page (Adobe UI) and visit_page_num are giving different counts.
Views
Replies
Total Likes
@Harveer_SinghGi1 Also you can see the discrepancy here.
Hit depth = 1 and Entry Page contains account.box are pretty close (green)
Using the inverse of the same logic, the numbers are very different (red)
Views
Replies
Total Likes
Sorry for all of the replies. I found a solution by not using visit_page_num and instead replacing the logic with visit_start_pagename. This gave me a 0.3% difference:
select
count(distinct(
case when post_pagename like '%account.box%' and visit_start_pagename not like '%account.box%' then concat(post_visid_high,post_visid_low) end))
FROM `adobe.adobe_data`
where date_trunc(date_time,month) = '2025-02-01' and exclude_hit = 0
Adobe Segments: 105,959
SQL Output: 105,693
Variance: 0.3%
Views
Replies
Total Likes
Views
Likes
Replies
Views
Likes
Replies