Single Page Visit and Bounces Calculation | Community
Skip to main content
August 25, 2022
Question

Single Page Visit and Bounces Calculation

  • August 25, 2022
  • 2 replies
  • 1577 views

I tried a few different methods to calculate Single Page Visit using DataFeed and none of them matched the number in Adobe Workspace, counted visits with exactly one pagename, one page_url, one post_pagename, one post_page_url, one hit, one visit_page_num, similar first and last hit time.  

I am wondering how to use datafeed to calculate the Single Page Visits. Thanks a lot.

This post is no longer active and is closed to new replies. Need help? Start a new post to ask your question.

2 replies

VaniBhemarasetty
Adobe Employee
Adobe Employee
August 25, 2022

Please try to add the exclude_hit column as well to your query. Also, this is the definition 

The ‘Single page visits’ dimension reports the number of visits that consisted of a single unique Page dimension item. It is the dimension form of the Single page visits metric.

That means you should calculate all the visits which have unique post_pagename

MiladShAuthor
August 26, 2022

,Hi Vani, Thanks for your reply. 

based on the query, hit sources (5,7,8,9) are removed from the data, and the exclude_hit is set to 0. 

Then visits with just 1 post_pagename are selected. The "Single Page Visits" are not matched with the Workspace. I am wondering if I miss more filters

 

Query:

df_t.filter((col('exclude_hit') == 0)).filter(udf_remove_hitsrc(col('hit_source'))).select('post_visid_high' , 'post_visid_low' , 'visit_num' , 'post_pagename').groupby('post_visid_high' , 'post_visid_low' , 'visit_num' ).agg(F.countDistinct('post_pagename').alias('n_')).filter(col('n_') == 1).select('post_visid_high' , 'post_visid_low' , 'visit_num' ).distinct().count()
leocwlau
Community Advisor and Adobe Champion
Community Advisor and Adobe Champion
August 26, 2022

Doing this with Spark and Python

First, get the number of visits for each month by counting unique visit id.

month_visits = (
hit_data
.select(
F.col('date_time').substr(1,7).alias('month'),
F.concat_ws("_", "post_visid_high", "post_visid_low", "visit_num", "visit_start_time_gmt").alias('vid')
)
.groupBy('month')
.agg(F.countDistinct('vid').alias('visits'))
.orderBy('month')
)
Then get the number of bounces per month by counting how many visits have only one hit.
month_bounce = (
hit_data
.select(
F.col('date_time').substr(1,7).alias('month'),
F.concat_ws("_", "post_visid_high", "post_visid_low", "visit_num", "visit_start_time_gmt").alias('vid')
)
.groupBy('month','vid').count()
.filter(F.col('count')==1)
.groupBy('month').agg(F.count('count').alias('bounce'))
.orderBy('month')
)
Finally, simply joining the two data frames and calculating the bounce rate by bounces/visits
(
month_visits
.join(month_bounce, on='month')
.withColumn('bounce_rate', F.col('bounce') / F.col('visits'))
.orderBy('month')
.show()
)
MiladShAuthor
August 26, 2022

Hi Leo, Thank you so much for your time and reply.

I found out we need to add filters on exclude_hit, and hit_source also to make accurate counts. 

I am wondering have you tried creating the number of "Single Page Visits" using DataFeed? 

leocwlau
Community Advisor and Adobe Champion
Community Advisor and Adobe Champion
August 29, 2022

Hi @miladsh, the following block is indeed getting the number of single page visits, it concatenates the for columns (post_visid_high, post_visid_low, visit_num, visit_start_time_gmt) to create a single unique visit id then count how many lines of record (hit) for each visit, then filter those with only one single hit in the visit. This gives the number of single page visits. I do not using the exclude_hit as we don't have many cases requiring this filtering for a matched number against AA workspace.

 

month_bounce = (
hit_data
.select(
F.col('date_time').substr(1,7).alias('month'),
F.concat_ws("_", "post_visid_high", "post_visid_low", "visit_num", "visit_start_time_gmt").alias('vid')
)
.groupBy('month','vid').count()
.filter(F.col('count')==1)
.groupBy('month').agg(F.count('count').alias('bounce'))
.orderBy('month')
)