Adobe Employee

☕[AT Community Q&A Coffee Break] 10/14/20, 8am PT: Jon Tehero, Group Product Manager for Adobe Target☕ [SERIES 2]

September 29, 2020

Join us for our next monthly Adobe Target Community Q&A Coffee Break,

taking place Wednesday, October 14th @ 8am PT

👨‍💻☕👩‍💻Register Now!👨‍💻☕👩‍💻

We'll be joined by Jon Tehero aka @jontehero, Group Product Manager for Adobe Target, who will be signed in here to the Adobe Target Community to chat directly with you on this thread about your Adobe Target questions pertaining to his areas of expertise:

AI improvements
A4T for Auto-Target
Slot-based Recommendations
General Adobe Target backend & UI

Want us to send you a calendar invitation so you don’t forget? Register now to mark your calendar and receive Reminders!

A NOTE FROM OUR NEXT COMMUNITY Q&A COFFEE BREAK EXPERT, JON TEHERO

REQUIREMENTS TO PARTICIPATE

Must be signed in to the Community during the 1-hour period
Must post a Question about Adobe Target
THAT'S IT! *(think of this as the Adobe Target Community equivalent of an AMA, (“Ask Me Anything”), and bring your best speed-typing game)

INSTRUCTIONS

Click the blue “Reply” button at the bottom right corner of this post
Begin your Question with @jontehero
When exchanging messages with Jon about your specific question, be sure to use the editor’s "QUOTE" button, which will indicate which post you're replying to, and will help contain your conversation with Jon

Jon Tehero is a Group Product Manager for Adobe Target. He’s overseen hundreds of new features within the Target platform and has played a key role in migrating functionality from Target's classic platforms into the new Adobe Target UI. Jon is currently focused on expanding the Target feature set to address an even broader set of use-cases. Prior to working on the Product Management team, Jon consulted for over sixty mid- to enterprise-sized customers, and was a subject matter expert within the Adobe Consulting group.

Curious about what an Adobe Target Community Q&A Coffee Break looks like? Check out the threads from our first Series of Adobe Target Community Q&A Coffee Breaks


Shani2

Level 2

Hi @Jon_Tehero thank you for your time today - these coffee break sessions are great 🙂 I wanted to share our experience regarding A4T offline significance calculations. As a whole, the process is quite time consuming and most of our experiments require that we do offline calculations – it would be more practical if the Target UI/Analytics Reporting/A4T workspace panel could compute calculated metrics, but even improving the performance of Data Warehouse UI would be a great improvement to the process, since it's a requirement for those of us that select A4T as the reporting source in the activity.

With the current process, monitoring tests does not seem practical because of the manual steps involved, but monitoring is truly a requirement for high-risk tests and there is no way around it: we can't wait for the test to reach the required sample size/# of conversions before ending the test and completing the significance analysis.

Currently, because our Analytics success metrics are continuous variables, we have to take these steps to perform offline significance calculations for A4T activities: 1) create multiple segments that are compatible with Data Warehouse; 2) pull various reports from Data Warehouse to break down the data into a digestible format; 3) once the report is available, sometimes hours or a day or two later, enter formulas in Excel to calculate visitors and compute the sum of the success metric squared; and 4) input the data into the Excel confidence calculator spreadsheet.
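
For readers following along, steps 3 and 4 above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: it uses a generic Welch-style comparison of two means with a normal approximation, which is not necessarily the formula in the Excel confidence calculator, and the function names and input figures are hypothetical.

```python
import math
from statistics import NormalDist


def summarize(visitors: int, metric_sum: float, metric_sum_sq: float):
    """Return (mean, variance of the mean) from per-experience aggregates."""
    mean = metric_sum / visitors
    # Unbiased sample variance rebuilt from the sum and the sum of squares.
    variance = (metric_sum_sq - visitors * mean ** 2) / (visitors - 1)
    return mean, variance / visitors


def confidence(control, challenger):
    """Return (relative lift, two-sided confidence that the means differ)."""
    mean_a, se2_a = summarize(*control)
    mean_b, se2_b = summarize(*challenger)
    z = (mean_b - mean_a) / math.sqrt(se2_a + se2_b)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    lift = (mean_b - mean_a) / mean_a
    return lift, 1 - p_value


# Hypothetical aggregates: (visitors, sum of metric, sum of metric squared)
control = (10_000, 52_000.0, 1_900_000.0)
challenger = (10_000, 55_000.0, 2_050_000.0)
lift, conf = confidence(control, challenger)
print(f"lift={lift:.2%} confidence={conf:.2%}")
```

The three inputs per experience are the same aggregates the steps above produce by hand, so a script along these lines could replace the spreadsheet formulas once the Data Warehouse report has arrived.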

In general, the process makes monitoring tests difficult and very time consuming, and I would go as far as to say it may even discourage the monitoring cycle of the testing process because it requires so much effort. The level of effort isn't ideal after a test ends either, but redoing these steps at weekly intervals, for example, while a test identified as high-risk for the business is running isn't practical.

Is there an improvement to this process on the roadmap, or any recommendation on how to create efficiencies with existing functionality? I couldn't find much information in the Adobe Cloud documentation on alternative solutions, but I was hoping you could provide more insight into future improvements or other ways we can achieve the same result with less effort.

Thank you!

JonTehero

Adobe Employee


Hi Shani2,

Thank you for your question! We've received a lot of requests for supporting calculated metrics. We know that this would improve the overall process/workflow for our customers. My peers on the Analytics product management team have this feature in their backlog, but we do not have any specific dates at this time.

If you have access to the Experience Platform and have your analytics data landing on platform, the query service is probably the best option for achieving this today.

drewb6915421

Adobe Employee

@shani2 Check out this Spark page on additional best practices for leveraging A4T: https://spark.adobe.com/page/Lo3Spm4oBOvwF/

Shani2

Level 2

Hi again, @Jon_Tehero, I have a statistical question. Is there a feature on the roadmap that would let the user select a one-tailed test as the statistical method in the Target UI during the experience setup? Currently, by default, both the Target statistical engine and the Sample Size calculator are configured for a two-tailed test. Technically, we can manually adjust the significance level in the sample size calculator if we wish to convert to a one-tailed test and manually choose to end a test based on a set of given parameters; however, it would be a lot less effort if that could be automated by being able to select one-sided vs two-sided tests in the UI. One-tailed tests require less time to run and have a lower error probability than two-tailed tests (i.e. alpha is not halved) when testing for a specific direction (i.e. positive/negative). If, for example, we are truly testing for superiority, then testing for a negative impact has no value, and the cost is having a test run for longer than needed for the business question of interest. As we look for more ways to be agile and become more efficient with our testing program, I am eager to learn whether expanding alternative ways of computing statistical significance is on Target’s product roadmap. Thank you!
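
To make the sample-size point concrete, here is a rough sketch of how the required visitors per experience change between a two-tailed and a one-tailed test. It assumes a textbook normal-approximation formula for comparing two conversion rates, which may differ from what the Adobe Target Sample Size calculator uses; the function name, baseline rate, and lift are hypothetical.

```python
import math
from statistics import NormalDist


def visitors_per_experience(baseline: float, lift: float, alpha: float = 0.05,
                            power: float = 0.80, one_tailed: bool = False) -> int:
    """Approximate sample size per experience for comparing two conversion rates."""
    p1 = baseline
    p2 = baseline * (1 + lift)
    # A two-tailed test splits alpha across both tails; a one-tailed test does not.
    tail_alpha = alpha if one_tailed else alpha / 2
    z_alpha = NormalDist().inv_cdf(1 - tail_alpha)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Hypothetical scenario: 5% baseline conversion rate, 10% relative lift.
two = visitors_per_experience(0.05, 0.10)
one = visitors_per_experience(0.05, 0.10, one_tailed=True)
print(two, one)
```

With these defaults the one-tailed figure comes out roughly a fifth smaller. Calling the function with `alpha=0.10` and `one_tailed=False` returns the same number as the one-tailed call at 0.05, which is the manual significance-level adjustment described above.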

JonTehero

Adobe Employee


@shani2,

Thank you for sharing your feedback on one-tailed experiments. We do not have anything on our roadmap at this time for supporting one-tailed tests.

Shani2

Level 2

@Jon_Tehero Sorry, I am full of questions, you can tell I couldn’t wait for this event 😊 This should be my last one, and it’s regarding the A/B Auto-Allocate (AA) algorithm. The concept/mechanism of AA is a great one; however, I think there are some limitations in a particular use case. I won’t go into the benefits of AA as those are plentiful, but I do want to highlight one specifically, i.e. optimization occurs in parallel with learning.

Moreover, in the event there are only 2 experiences, I think there is a true risk with false positives (higher than 5%) with the current algorithm logic, i.e. after the better performing experience reaches 95% confidence, 100% of traffic is allocated to the experience identified as the winner. Unlike the logic for 3 or more experiences in which 80% of traffic is allocated to the winner and 20% of traffic continues to be served randomly to all experiences – this is key in the event there are user behavior shifts and confidence intervals begin to overlap with other experiences while the test is running.

I’ve encountered a few experiences using Target’s manual A/B test in which the stats engine has called a winner early and a badge was displayed in the activity, however, after hours/days/weeks of collecting more data, the engine removes the badge as it recognizes that confidence levels are still overlapping/fluctuating. This is a prime example of how important it is to determine sample size/tests parameters before running a test to prevent ending a test prematurely and to ensure statistically valid results, but also why I raise my concern with the AA logic specifically for the 2 experiences scenario. Currently, there is no room for the algorithm to correct itself in the event it identified an experience as a winner that truly was not because there isn’t a reserve of traffic allocated for learning if user behavior changes – this is not truly a multi-armed bandit approach in this use case because after 95% confidence is reached optimization no longer occurs in parallel with learning.
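
The effect described above, where a winner is flagged early and later loses the badge, can be reproduced with a small simulation. This is a sketch under simplifying assumptions: it uses a plain pooled z-test rather than Target's actual statistics engine, both experiences share the same true conversion rate so every "winner" is a false positive, and all names and parameters are hypothetical.

```python
import math
import random
from statistics import NormalDist


def is_significant(conv_a, conv_b, n, alpha=0.05):
    """Two-sided pooled z-test on two conversion counts with n visitors each."""
    pooled = (conv_a + conv_b) / (2 * n)
    se = math.sqrt(2 * pooled * (1 - pooled) / n)
    if se == 0:
        return False
    z = (conv_b - conv_a) / n / se
    return 2 * (1 - NormalDist().cdf(abs(z))) < alpha


def simulate(runs=300, looks=10, visitors_per_look=200, rate=0.10, seed=7):
    """Return (false-positive rate at any look, false-positive rate at the final look)."""
    rng = random.Random(seed)
    any_look_hits = final_hits = 0
    for _run in range(runs):
        conv_a = conv_b = n = 0
        ever_significant = False
        significant = False
        for _look in range(looks):
            # Same true rate for both experiences: no real difference exists.
            conv_a += sum(rng.random() < rate for _ in range(visitors_per_look))
            conv_b += sum(rng.random() < rate for _ in range(visitors_per_look))
            n += visitors_per_look
            significant = is_significant(conv_a, conv_b, n)
            ever_significant = ever_significant or significant
        any_look_hits += ever_significant
        final_hits += significant
    return any_look_hits / runs, final_hits / runs


any_look_rate, final_rate = simulate()
print(f"any look: {any_look_rate:.1%}, final look only: {final_rate:.1%}")
```

Because the final look is one of the looks, the "any look" rate can never be lower than the "final look only" rate. The gap between the two is the extra false-positive risk that comes from acting on the first significant reading instead of a sample size fixed in advance.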

Furthermore, another concern on the logic of the algorithm for two experiences is that hypothetically we cannot detect a novelty effect because the algorithm may declare an experience a winner too early. We have observed novelty effects after adding a new feature that is attention grabbing in manual A/B tests, for the first two weeks a challenger may be performing better than the default experience and display a badge, but with time the positive effect wears out as more data is collected – confirming that the lift was only an illusion.

In sum, I hesitate to use AA for 2 experiences due to the current AI logic. But the dilemma is that we don’t tend to test more than 2 experiences in our organization. Are there any suggestions on how we can mitigate false positives for 2 experiences for AA? Is enhancing the algorithm for two experiences on the roadmap so that it serves as a true multi-armed bandit approach to optimization? Lastly, on the product roadmap, will users have the ability to set the significance level for AI-driven activities? Not all tests are created equal; they will not have the same risks/costs, so some tests may require a false-positive rate lower or higher than 5%.

Please note I am aware of the time-correlated caveat for AA and the experiences I discussed above re Manual A/B tests were not contextually varying.

Thank you!

JonTehero

Adobe Employee


@shani2,

Our logic for 2 experiences and for more than 2 experiences is actually the same: in both scenarios, once a winner is declared, we allocate 80% of traffic to the winner and split the remaining 20% of traffic evenly among all experiences, including the winner. So in a case where 2 experiences are present, at the time we declare a winner, we'll send 90% of traffic to the winning experience (80% plus half of the remaining 20%) and 10% of traffic to the other experience.
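
The arithmetic behind that split can be written out as a short sketch. It assumes the 20% exploration share is divided evenly across all experiences, which is what the 90/10 figures imply; the function name and experience labels are hypothetical and this is not Target's implementation.

```python
def traffic_split(experiences, winner, exploit_share=0.80):
    """Share of traffic per experience once a winner has been declared."""
    # The exploration share is spread evenly over every experience, winner included.
    explore_each = (1 - exploit_share) / len(experiences)
    return {
        name: explore_each + (exploit_share if name == winner else 0.0)
        for name in experiences
    }


for names in (["A", "B"], ["A", "B", "C", "D"]):
    split = traffic_split(names, winner="B")
    print({name: round(share, 3) for name, share in split.items()})
```

With two experiences the winner ends up at 90% and the other experience at 10%, matching the figures in the reply; with four experiences the winner gets 85% and each of the others 5%.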

If for any reason you are seeing behavior different from what I've described above, please submit a ticket to customer care so that we can take a look.

JonTehero

Adobe Employee

Hello everyone! I am looking forward to chatting with you in a few minutes and answering your questions.

frfr12ae

@jontehero
What is the best way to combine testing methods (A/B testing while personalising, i.e. XT & AB) to make sure that you are always improving?

JonTehero

Adobe Employee


Hello @frfr12ae ,

Great question! Within our A/B activity, you can do both personalization and experimentation. This allows you to test out your personalization techniques and learn what is working best. You can get pretty sophisticated in testing your personalization by utilizing a feature that allows you to serve different variations of the same experience to different audiences. This could come in handy if you were personalizing content across different geos with different languages, etc.

hartung

Adobe Employee

Hi Everyone! I'm looking forward to hearing from Jon today. He always impresses with his Target and Recommendations depth and breadth of knowledge.

mattravlich

Level 2

@Jon_Tehero Hi -> with the new A4T view in workspace, will we ever use calculated metrics within the view to get confidence levels? Any tips for using this now? Thanks.

drewb6915421

Adobe Employee

@mattravlich The Analytics team is working on enabling support for calculated metrics, but the complexity arises from how Analytics collects data based on visitors. There are good suggestions on best practices for using A4T and success metrics on this Spark page: https://spark.adobe.com/page/Lo3Spm4oBOvwF/

mattravlich

Level 2

@drewb6915421 -> the link doesn't seem to work. It just redirects me back to this thread.

hartung

Adobe Employee

Hey Jon,

@wixyfun asked the following in our forums:

Hi,

I have some issues with Catalog and inclusion rules comparisons in Adobe Target Recommendations.

I am trying to use "is less than or equal to" to compare entity attribute values that are numeric, and it looks like Adobe treats the attribute values as strings, so I only get a correct entity match when using "equals".

For example, entity.hoursSinceUpdate = 358 is returned when searching/filtering for hoursSinceUpdate is less than or equal to 72

What could be causing this issue?

Should I delete entities from the catalog or try updating them instead?

Any suggestions are appreciated, thank you

https://experienceleaguecommunities.adobe.com/t5/adobe-target-questions/entity-attributes-cannot-be-compared-using-less-than-or-equals/qaq-p/378360

JonTehero

Adobe Employee


Hello @wixyfun,

Thank you for your question. You are correct in that the "is less than or equal to" operator is looking for a numeric value. However, when you create a custom entity.attribute, the default data type is string. If you reach out to customer care, they can submit a ticket to convert the field type to a numeric field so that the evaluation will work as intended. I will make sure we add this information to our documentation as well.
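
As an illustration of why a string-typed attribute can produce the reported match, the snippet below compares the same two values as strings and as numbers. It only demonstrates generic lexicographic comparison; whether Target's backend evaluates string fields exactly this way is an assumption, and `matches_rule` is a hypothetical helper.

```python
def matches_rule(value, limit):
    """Evaluate 'is less than or equal to' using the operands' own types."""
    return value <= limit


# Stored as strings: compared character by character, so "3" < "7" decides it.
print(matches_rule("358", "72"))   # True

# Stored as numbers: compared by magnitude.
print(matches_rule(358, 72))       # False
```

Under string comparison, 358 "passes" a threshold of 72 because only the leading characters are compared, which is consistent with the symptom in the question and with the fix of converting the field to a numeric type.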

mattravlich

Level 2

@jontehero do you have any tips for scaling the testing amount within a SPA (single-page application)?

Amelia_Waliany

Author

Adobe Employee

Hi @jontehero! Thank you for your time today 🙂

@mau_gloria posted this question in the Community:

Hi, I'm trying to set up a recommendations module on a page, but the recommendations are not pushed to the webpage. The only way to see them is by using the "Preview" link included in Adobe Target, which is pulling 3 different offers along with their URLs. Feeds are working fine (scheduled every day in the morning), and in the overview screen of the activity I saw "Results ready" with a green indicator.

- I tried QA link, nothing.

- I tried visiting the page, nothing.

Using at.js 2.1.

LINK TO ORIGINAL POST

JonTehero

Adobe Employee


I concur with what @karandhawan said and would recommend using the "?content-trace=true" option as well.

A couple of additional tips:

Depending on the type of algorithm, it may require a "key", which is the entity or category that you are basing the recommendations on. For example, in "people who viewed this also viewed these...", the "this" represents the key.
When entities are brand new or when an algorithm first runs, as recommendations are requested, we push the results of the algorithm and the details of the entity to our edges. Sometimes you may need to refresh a couple of times to allow the results to fully propagate to the edge. This is generally something that is only really felt while QA'ing the activity and is limited at most to the first couple of views of an entity.
