Authors: Frederik Werner, Akintunde Ajayi, and Nikhil Maddirala

Frederik Werner is a German Analytics Lead and Data Scientist working in and writing about Web Analytics and Online Marketing Technology. In this second post of a four-part series, Frederik will provide the overview and get-started tips on Adobe Experience Platform Query Service. He gives examples of his favorite tricks using Query Service and specifically for customer journey analysis use cases in Customer Journey Analytics. It also includes how to write back data to Adobe Experience Platform.
Query Service is a feature of Adobe Experience Platform. Adobe Experience Platform provides SQL analytics and AI/ML capabilities that enable enterprises to generate and operationalize customer intelligence for driving improved personalized customer experiences.
Adobe-defined functions: Sessionization
Recreating Visits using Sessionization
One of the surprises I encountered when starting with Adobe Experience Platform is that the Adobe Analytics Connector does not include the Visit Number dimension (which is one of my beloved favorites) although the Schema has a key for it. When I pointed this out to Adobe, they referred me to the Sessionization features of Query Service. As I will show, this is not equivalent at all but still very cool, especially for some more creative use cases.
Quoting from the documentation, the syntax for this function looks like this:
SESS_TIMEOUT(timestamp, timeout_in_seconds) OVER ([partition] [order] [frame])
We can see that it is a window function with quite a bunch of parameters to tune. It will return a Struct with four fields, timestamp_diff
(containing the number of seconds since the start of the Visit), num
(containing the Visit Number), is_new
(containing a boolean value for the first hit of a Visit), and depth
(containing the number of a hit in a Visit), giving us everything we need to recreate Visits from Adobe Analytics.
If we want a simple session for a given user, we could use a query like this, where our table contains a timestamp
and user_id
column:
sess_timeout(timestamp, (60 * 30)) OVER (PARTITION BY user_id ORDER BY timestamp ASC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) session
Let’s go through this. The parameters of the sess_timeout
the function is pretty intuitive. They tell Query Service to define a session based on the timestamp of our rows and allow up to 30 minutes (60 seconds times 30) between two rows as the session limit. The next part is a bit more complicated since we have to tell the sess-timeout
function of how we want to define a User (or Visitor). To achieve this, we need to partition our data by the user_id
column (returning only the data for the same user_id
that the current row has) and order it by timestamp
in ascending order. The last part defines that we only want to look at the data before the current row, so we don’t end up looking in the future.
This will give us a result like this:
Figure 1: Sessionized data from Query Service
This shows two things: For three rows I assumed timestamps in Adobe Experience Platform were in seconds, while they actually are in milliseconds. And our sessionization works nicely, giving us the four values we expect. Now we successfully have recreated Visits in Adobe Experience Platform. But there is a lot more we can do with this function by modifying how we window our data. For example, we could just add the page dimension like this:
sess_timeout(timestamp, (60 * 30)) OVER (PARTITION BY userid, page ORDER BY timestamp ASC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) pagesession
Instead of just partitioning our data by userid
, I now threw the page column in there as well. That way we get a new session whenever the user moves to a new page. But why would we want this? This would not replace our normal session but provide some very interesting insights, like: What is the 3rd action our users take on a certain page? How long does it take them to find a certain popup on a page? Here are some examples of how to partition data in a more creative way:
- Partition by
userid
and Form name for form tracking: This would show in which order our users fill out our forms and how long each step takes. We would also have the number of the form fillout-session
within a Visit where multiple forms are used.
- Partition by
userid
and Ecommerce product: What is the 2nd action our users do with a product in succession? Do they remove it from the cart immediately after adding it?
- Partition by
userid
and video name: What are our users doing with our video content? Is the second action a pause or closing the video? How long do they keep a video paused before resuming it?
All of this can be done by applying the concept of a session in a more creative way. Let me know which criteria come to your mind.
Getting creative with Sessionization
Let’s dig a bit deeper. Recreating Adobe Analytics sessions in Query Service is easy thanks to the SESS_TIMEOUT function. As a quick reminder: It’s a window functions that returns four values for each row of data, giving us something equivalent to the “Hit Depth” and “Time spent” values in Adobe Analytics. Again, this is the syntax:
SESS_TIMEOUT(timestamp, timeout_in_seconds) OVER ([partition] [order] [frame])
To get the standard Visit from Analytics back, we would use it like this:
sess_timeout(timestamp, (60 * 30)) OVER (PARTITION BY user_id ORDER BY timestamp ASC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) session
Which gives us an output like the rightmost column (in JSON format for easier readability):
Figure 2: Result table of sessionized hits
Now we have our Analytics Visits back! timestamp_diff gives us both the time since the last event (seconds since the last row of a visit) and, for the first hit, the time since the last visit (see timestamp_diff of the second row).
SQL veterans will recognize this Adobe Defined Function as a normal variation on window functions. So let’s have some fun! For example, what would happen if we reversed the order of rows in the window frame? Let’s try this (note the “desc” instead of “asc”) and see what we get:
sess_timeout(timestamp, (60 * 30)) OVER (PARTITION BY user_id ORDER BY timestamp DESC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) reverse_session
Gives us:
Reverse sessions with Query Service
First thing to recognize: The num from reverse_session
is useless now since the functions is pretty confused by our reversed timestamps. But that doesn’t matter, we already have that information in the normal session output. The rest is in turn very helpful: depth now gives us an indication of how close to the end of a session an action occurred. Now we can easily answer questions like “what was the last thing a user did in their session?”, or “how many events happened after the purchase event in a session?” which are all quite helpful things to know.
Wrap Up
I hope you found this article helpful. You should be able to create some advanced fields with Query Service and write it back to an existing or new dataset. Also, you now know how to find out what went wrong if datasets can’t be written back successfully. There will be 3 more subsequent blogs with my other favorite tips using Query Service.
Will I ever return to normal Adobe Analytics after all this Query Service and Customer Journey Analytics work? Yes, of course. It’s still my main driver for day-to-day analysis. But with all the exciting information I can get out of my boring old data I will gradually spend more and more time in Query Service and Customer Journey Analytics in the future.
This blog originally appeared The Full Stack Analyst on 10/21/2020.
Follow the Adobe Experience Platform Community Blog for more developer stories and resources, and check out Adobe Developers on Twitter for the latest news and developer products. Sign up here for future Adobe Experience Platform Meetups.
References
- Adobe Experience Platform
- Adobe Experience Platform Query Service Tutorials
- Adobe Experience Platform Query Service Web Page
- Customer Journey Analytics Web Page
Related Blogs
Originally published: Dec 17, 2020
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.