Expand my Community achievements bar.

SOLVED

How to implement "Then within a hit" segment logic in sql using data feed

Avatar

Level 1

I try to implement the 'Then within a hit' logic in sql but couldn't get the correct result with the dashboard. How to implement this logic? 

Topics

Topics help categorize Community content and increase your ability to discover relevant content.

1 Accepted Solution

Avatar

Correct answer by
Community Advisor and Adobe Champion

Oh yes, that is something that is really important to note. When you use data warehouse and/or data feeds, it doesn't low traffic bucket any of the variables, you see all the values in the dimension, unlike in workspace. Glad that you figured it out!

View solution in original post

5 Replies

Avatar

Community Advisor and Adobe Champion

Data feeds are an interesting beast. The thing to keep in mind with them is that every hit comes in on a separate row. So if you're trying to get "x THEN y", you're going to need to look across multiple rows. 

First, determine if you want both conditions to exist within the same visit or visitor. For visitor, concatenate post_visid_high, post_visid_low, and first_hit_time_gmt. For visit, add visit_num to the other 3 columns. 

Second, you'll need to understand the order of the hits within the visit/visitor. You can use visit_page_num to order them. It's basically the hit depth of each hit.

Third, you need to identify your conditions. Get the hit number associated with X and get the hit number associated with Y. 

Then you need to write a logical comparison, making sure that Y happens after X. If you want it to be the very next hit, then your logical comparison should make sure that Y = X + 1 (using visit_page_num). 

 

This should help you get started.

Avatar

Level 1

1. I use post_visid_high || post_visid_low || visit_start_time_gmt || visit_num as a unique identifier for each visit. Is that the same?
2. I tried date_time and visit_page_num for both but none of them are returning the same result. 

3. What do you mean by hit number? 

 

This is a more detailed explanation of my issue.
The conditions I implemented on the dashboard is: Count the number of visits with a hit of evar10 exists and then within one hit, pagename (evar5) contians "string"

 

I use the following sql: 

WITH CTE AS (
SELECT

post_visid_high || post_visid_low || visit_start_time_gmt || visit_num as visid, 

post_evar10,

LEAD(post_evar5, 1) OVER(PARTITION BY  post_visid_high || post_visid_low || visit_start_time_gmt || visit_num ORDER BY date_time) AS next_hit_pagename
FROM my_table) 

SELECT COUNT(DISTINCT visid) 

FROM CTE

WHERE (post_evar10 != '' OR post_evar10 NOT NULL) AND next_hit_pagename LIKE '%string%'. 

it's giving me different number from the dashboard, so I was wondering how the 'then within next hit' conditions should be applied. 

 

 

Avatar

Community Advisor and Adobe Champion

The main difference between visit_start_time_gmt and first_hit_time_gmt is that the first one is at a visit level and the second is at a visitor level. We've noticed that user ID's (post_visid_high and post_visid_low) can get resused over time (often it happens at least a year apart). This means that the same visitor ID can have two visit #1, etc. The first_hit_time_gmt is tied to the visitor, so even if post_visid_high and post_visid_low get reused, they will have a different first_hit_time.

 

The hit number would be the depth of the hit within the visit, in the data feeds, it's the column visit_page_num. It counts both custom link instances and page views. 

 

What is the expiry on your evar10? Post_evar10 is going to give you a value on all the hits that it fired and all the hits that it persisted. If you want to make sure it's a hit where evar10 actually fired you will want to use your event list and make sure evar10 instances fired too.

 

I think the way that I would do this would be to join the table to itself.

Make a CTE for all the hits with evar10 (including the instances metric to make sure it actually fired and not persisted). 

Make a separate CTE for all the hits with pagename contains "string".

In both CTE's, make sure to include your visitor ID information, visit info, and visit_page_num.

 

I would then join the CTEs where evar10.visit_page_num = (pagename.visit_page_num + 1). Also join on your visit and visitor IDs too.

 

Avatar

Level 1

Thank you for the reply! I just check and confirm that the difference we are seeing is not related to the then within 1 hit logic. It's due to the low traffic variables that are not counted within the segment. 

Avatar

Correct answer by
Community Advisor and Adobe Champion

Oh yes, that is something that is really important to note. When you use data warehouse and/or data feeds, it doesn't low traffic bucket any of the variables, you see all the values in the dimension, unlike in workspace. Glad that you figured it out!