In this article, we will walk through a simple demonstration of how to use a linear regression model in Adobe Analytics to predict estimated revenue from the number of texts sent, and then compare the calculations using sklearn.
In this digital age, businesses collect huge amounts of data to gain insights about their customers, serve them relevant content, and optimize their business. One key part of optimization is to direct the marketing spend toward the most influential channels, based on the impact each channel or touchpoint has on a conversion.
One way to align the marketing spend across campaigns is to measure each campaign's impact on conversion and adjust the spend for all channels so that conversions are high. Businesses use multiple statistical approaches to reach this level of optimization. Adobe Analytics provides some out-of-the-box functions that make it easier for a marketer to apply these statistical methods. Let's see how this works.
What are linear regression models?
Linear regression models are used to show or predict the relationship between two variables or factors. Linear regression is widely used in the industry today to predict a value from changes in the input factors. The linear regression line has the equation:
Y = a + bX, where:
Y is the dependent variable
X is the explanatory variable
a is the intercept
b is the slope of the line
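As a minimal sketch of how a and b can be obtained, assuming the usual ordinary least squares definition: b is the sum of the cross-deviations of X and Y divided by the sum of squared deviations of X, and a is the mean of Y minus b times the mean of X. The function name fit_line and the toy numbers are made up for illustration and are not taken from the report suite.

```python
import numpy as np

def fit_line(x, y):
    """Return (a, b) for the least squares line Y = a + bX."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: sum of cross-deviations divided by sum of squared deviations of X
    b = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point of means
    a = y_mean - b * x_mean
    return a, b

# Hypothetical toy data lying exactly on Y = 1 + 2X
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)
```

On data that sits exactly on a line, the function recovers that line's intercept and slope; on noisy data it returns the best-fitting line in the least squares sense.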
In this demonstration, we will be using the following elements, which are already configured in an analytics report suite:
Figure 1: Elements in Analytics Workspace
Now, let’s see how the data looks in Analysis Workspace:
Figure 2: Trendline in Analytics Workspace
Figure 3: Revenue & Text Sent Table in Analytics Workspace
Now, we would like to estimate the revenue we could expect if we choose to send 25,000 texts and 50,000 texts. For this, we will create calculated metrics that give us an estimate of the expected revenue, using the formula mentioned above.
The calculated metric builder provides advanced statistical functions for both the intercept and the slope.
Figure 4: Calculated metric builder in Analytics Workspace
We will now create a calculated metric called Estimated Revenue @ 25000 Text and add a description for it. To do this, we open the calculated metric builder, search for Linear regression : Intercept, and add it to the formula bar in the builder. This function requires two input metrics, metric_X and metric_Y, which are the Texts Sent event and Revenue respectively. Next, we drag and drop these metrics into their respective blanks in the builder. This is how the calculated metric builder looks:
Figure 5: Calculated metric builder view in Analytics Workspace
Next, we add another container to add additional function values:
Figure 6: Another container to add function values in Analytics Workspace
Now, we search for Linear regression : Slope and drag it to the second container, set the operator between the two containers to +, and add the same metrics that we added in the first container for metric_X and metric_Y. Next, in the second container, we click on Add and choose Static number.
Figure 7: Add Static number
Next, we change the operator between the slope and the static number to multiplication (“X”) and enter 25000 in the static number field. For the simplicity of the demo, we leave the include zeros option unselected and save the calculated metric with its format set to currency.
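The formula assembled in the builder is intercept(metric_X, metric_Y) + slope(metric_X, metric_Y) × static number. A rough Python analogue of that structure is sketched below. It is not Adobe's implementation: the function name estimated_value and the toy data are hypothetical, and np.polyfit stands in for the two regression functions.

```python
import numpy as np

def estimated_value(metric_x, metric_y, static_number):
    """Mirror the calculated metric: intercept + slope * static_number."""
    # np.polyfit with degree 1 returns the coefficients as [slope, intercept]
    slope, intercept = np.polyfit(metric_x, metric_y, 1)
    return intercept + slope * static_number

# Hypothetical toy data following revenue = 50 + 2 * texts
texts = np.array([100.0, 200.0, 300.0, 400.0])
revenue = np.array([250.0, 450.0, 650.0, 850.0])

estimate = estimated_value(texts, revenue, 1000)
print(round(estimate, 2))
```

Changing only the static number, as the second metric does with 50000, moves the estimate along the same fitted line.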
Next, we open the calculated metric, replace 25000 with 50000 in the name, the description, and the static number field, and use the Save As option to save it as the second prediction metric. Now that both estimated revenue metrics are ready, we add them to the same table. This gives us two lines for the predicted revenue values. The graph looks like this:
Figure 8: Revenue Metrics and Predictive Revenue Dashboard
The values can be seen in the freeform table:
Figure 9: Values of Figure 8 dashboard
The estimated revenue for 25,000 and 50,000 texts sent is $60,119.91 and $139,657.08 respectively.
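Because both estimates come from the same line Y = a + bX, the slope and intercept that the calculated metrics used can be backed out from these two values. The short sketch below does only that arithmetic; the variable names are illustrative.

```python
# Estimates reported in the freeform table
est_25k = 60119.91   # Estimated Revenue @ 25000 Text
est_50k = 139657.08  # Estimated Revenue @ 50000 Text

# Two points on the same line Y = a + bX determine b and then a
slope = (est_50k - est_25k) / (50000 - 25000)
intercept = est_25k - slope * 25000

print(f"slope = {slope:.4f}")
print(f"intercept = {intercept:.2f}")
```

This implies a slope of about 3.18 in revenue per additional text and an intercept of roughly -19,417, which are the two numbers to compare against the sklearn fit.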
Now let’s validate this in a Jupyter notebook. Open a notebook and run the following commands in order:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
text_sent = np.array([17490,16408,16599,15734,17128,16790,17624,18414,20218,15294,16962,18509,16789,16406,15487,16686,19372,17686,16102,16955,17653,17318,17111,18819,17866,16811,17388,16261,18518,16656,15926,16340,18456,18323,16100,11778,12198,15367,17114,18204,15308,17817,18191,20134,17728])
revenue = np.array([30558.68,27670.99,35314.49,35727.71,35004.27,46590.4,57561.23,40382.05,36280.23,40021.67,40013.32,52243.1,30624.98,25000.66,34606.89,44107.78,56486.35,48119.19,26327.85,29922.3,40704.71,28114.83,34379.26,38558.02,31622.02,30102.82,33495.5,35259.82,28400.88,26217.87,42847.65,34758.68,51224.18,32468.67,27971.82,18426.69,8462.23,29888.69,46552.6,34761.04,23668,26129.11,28615.34,39937.2,18231.33])
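# Reshape to a 2-D column, as scikit-learn expects for the feature matrix
text_sent = text_sent.reshape(-1,1)
# Fit the model before predicting; predict() fails on an unfitted estimator
linreg = LinearRegression()
linreg.fit(text_sent, revenue)
predicted_revenue = linreg.predict(text_sent)
# Plot the fitted regression line
plt.plot(text_sent, predicted_revenue, color='red')
plt.title("Revenue by Text Sent")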