Delete data in Adobe Analytics

Level 2

10/13/16

When bugs occur on a web page, events and variables in Adobe Analytics can be affected and end up containing "wrong" data. For example: you are measuring error events for a webpage component and, due to a bug, every event is fired 10x instead of once. In all reports this massive spike now skews averages, etc.
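To make the effect on averages concrete, here is a minimal sketch with made-up numbers. The daily counts and the day on which the bug occurs are assumptions for illustration only, not real report data.

```python
from statistics import mean

# Hypothetical daily counts of an error event (all numbers are made up).
daily_errors = [120, 115, 130, 125, 118, 122, 127]

# Assume a bug on one day (index 3) fired every event 10 times instead of once.
buggy = daily_errors.copy()
buggy[3] = daily_errors[3] * 10

true_avg = mean(daily_errors)
reported_avg = mean(buggy)

print(f"average without the bug: {true_avg:.1f}")
print(f"average as reported:     {reported_avg:.1f}")
```

A single bad day is enough to more than double the weekly average in this example, which is why the spike cannot simply be ignored.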

 

You could deal with that by using filters, but the more users have access to the raw data, the harder this gets, as you can't enforce segments or filters on all users. You could also set calendar events, but those only show up in charts, not in tabular analysis.

 

Therefore we need the capability to delete data from individual eVars and props for selected time frames.

14 Comments

Community Advisor

10/13/16

This is a big problem for our business.

 

Both for the reasons already mentioned and also in the case where we want to repurpose a variable.

 

If we repurpose a variable and it still contains data relating to the old purpose, it's very confusing for users and can lead to issues.

 

Around 40% of our variables are currently tied up containing data that we no longer require and we are waiting to repurpose.

Employee

10/13/16

Thanks for submitting this idea, @Creativebyte, and for commenting, @AndyW. I can absolutely see how this could be a problem. I'm hopeful that we can solve this in the future. In the meantime, we'll keep an eye on this idea for votes and comments.

 

cc: @tpaulsen_PM @mfreestone

Level 6

10/14/16

As for re-purposing the variable: besides the fact that Adobe has taken eVars from 75 to 100 in the past couple of years, and you can go to 250 with any of the premium solutions, sometimes it's good to re-use a variable of value.

One way we have found is first to make sure the variable is not used for some period of time (we like to use a fiscal year plus 1 month so we can do year-over-year reports eventually).

Make sure you allocate a prop to contain your s_code version (which should include the Adobe version and your own local edit version, using dates, release numbers, etc.) so you have a version number where this new variable’s purpose starts. Also make sure to use major/minor release numbers etc. so you know all the s_code versions where this variable has the re-purposed value.

In the admin console, make sure to check the ‘reset’ box and select ‘reset’ in the dropdown when you re-define the variable; that will have the back end delete all the current values for that eVar for all customer IDs. [You don’t have this option for props because they have no context and expire on the next page view.]

Then you have a bunch of choices. You can create one segment that includes all data from when the eVar had the old name/value and one for the new name/value, and you can create virtual report suites for the old usage and the new usage (which lets you close that “don’t use this eVar for a year” window).

Also, you can create a processing rule for the case where some old crufty version of your s_code got pulled into someone’s cache, or was copied and is loaded directly from local storage rather than from your master copy. If you can identify what valid data for that eVar looks like going forward, and the data coming in doesn’t match it, you can delete the value in the current server call, since that isn’t the type of data you want.

Also, with 1000 events you can probably spare a couple to fire when this happens (i.e. when old/bad data for the eVar appears, fire an event in a processing rule and delete the value of the eVar, or copy it to some debug eVar). Then you can go back and attempt to find where the old information is coming from by pulling URLs, s_code versions, etc.
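The rule logic described above can be sketched in a few lines. This is only an illustration of the decision logic, not Adobe's processing rules interface (rules are configured in the admin UI, not written as code). The variable names eVar27, eVar99 and event999, and the "sku-<digits>" format, are all assumptions made up for the example.

```python
import re

# Assumption: valid values for the re-purposed eVar look like "sku-12345".
VALID_EVAR27 = re.compile(r"sku-\d+")

def apply_rule(hit: dict) -> dict:
    """Return a cleaned copy of one server call, modelled as a plain dict."""
    cleaned = dict(hit)
    value = cleaned.get("eVar27")
    if value is not None and not VALID_EVAR27.fullmatch(value):
        # Keep the stale value in a (hypothetical) debug eVar for later tracing.
        cleaned["eVar99"] = value
        # Fire a (hypothetical) spare event so the bad hits can be counted.
        cleaned["events"] = cleaned.get("events", []) + ["event999"]
        # Remove the stale value from the real variable.
        del cleaned["eVar27"]
    return cleaned

old_hit = {"eVar27": "summer-campaign", "prop10": "H.27|2016.10.1", "events": ["event1"]}
new_hit = {"eVar27": "sku-4711", "prop10": "H.27|2016.10.2"}

print(apply_rule(old_hit))
print(apply_rule(new_hit))
```

Because the version prop travels with the flagged hit, the debug eVar plus the spare event give you a way to see which s_code versions still send the old values.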

I do think attempting to actually delete information, good or bad, is not a practice that 99% of analytics experts should get involved with. Do you do anything with “Data Sources”? They give you some limited ability to upload data (or change existing data). The best example: you have a purchaseID and saw revenue come in, but the purchase was returned and you gave a refund. Say the purchase was $100.00. You can use Data Sources to take that unique purchaseID and upload revenue of -100.00 to zero it out. Or add a new event called ‘refunds’ and give it a value of $100.00. Then in your reports you can show revenue, returns and a calculated metric of revenue minus returns to get adjusted revenue.
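A minimal sketch of the arithmetic behind the two refund options, using made-up purchase IDs and amounts. It does not show the Data Sources upload itself; the dictionaries below are only stand-ins for the data involved.

```python
from decimal import Decimal

# Hypothetical orders keyed by purchaseID (IDs and amounts are made up).
orders = {"PID-1001": Decimal("100.00"), "PID-1002": Decimal("250.00")}

# Option 1: upload a negative revenue row against the same purchaseID.
corrections = [("PID-1001", Decimal("-100.00"))]
net = dict(orders)
for purchase_id, amount in corrections:
    net[purchase_id] += amount

# Option 2: record the refund in a separate 'refunds' event and
# compute adjusted revenue as a calculated metric (revenue - returns).
refunds = {"PID-1001": Decimal("100.00")}
revenue = sum(orders.values())
returns = sum(refunds.values())
adjusted_revenue = revenue - returns

print("option 1 net revenue:     ", sum(net.values()))
print("option 2 adjusted revenue:", adjusted_revenue)
```

Both options arrive at the same adjusted total; the second keeps the original revenue figure visible next to the returns.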

Adam Greco has a number of these examples in his book and blog entries. These sorts of examples aside, it’s not the easiest thing to set up and get correct.

Obviously, asking Adobe to clear out all data in eVar23, eVar27, Prop14, Event15 and Event27 before 7/1/2016 would be an interesting capability. But I think the workarounds using calendars, segments, processing rules, virtual report suites, etc. are a lot easier to implement and understand. The last thing you want, after working for some time to get Adobe to delete a bunch of values for a variable, is to have someone come back and say ‘Hey, we really need that pre-2016 data for eVar27’.

Level 2

10/16/16

Hi warrensander,

 

Thanks for your comment. Please know that I do not want the delete capability in order to delete data and then repurpose the variable. I need it to delete "wrong" data that was flowing in because of bugs in the programming. If I were a single analyst, I could simply set up segments or filters to get rid of those bad data spikes, but I am not, and Adobe Analytics has no way of introducing forced segments.

 

I understand you are hesitant to delete data - it is too easy to also delete data you still want - but I see no other way. Unfortunately your correction method does not work either, as I would have a bad data spike on one day and another bad data spike on another. In total they would cancel each other out, but the individual days would both be wrong and would cause all kinds of questions when looking at the charts.
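The concern can be illustrated with made-up daily numbers. The sketch assumes the correction is recorded on a different day than the bug; the values and the day indices are hypothetical.

```python
# What each day should have shown (made-up numbers).
expected = [100, 100, 100, 100]

# A bug inflates day index 1; the offsetting correction lands on day index 2.
reported = [100, 1000, 100, 100]
correction = [0, 0, -900, 0]

after = [r + c for r, c in zip(reported, correction)]

# The total over the whole period is right again...
print("total expected:", sum(expected), "total after correction:", sum(after))

# ...but the individual days are both wrong.
wrong_days = [i for i, (a, e) in enumerate(zip(after, expected)) if a != e]
print("days still wrong:", wrong_days)
```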

 

A feature like this needs to be treated with care - admin only, logs of what has been done - but I think it is needed.

Level 6

10/17/16

Hi Creativebyte,

 

You can sort of do 'forced' segments with virtual report suites, since those are based on segments to begin with. And I understand being overworked and understaffed when trying to fix large amounts of bad data.

 

You should look at Data Sources because this does allow you to delete (zero out is probably a better word for it) bad data in events assuming you can actually find it.

 

We had an issue with an underflow where an event got translated into hex FFFFFFFFF..., which ended up being 4 billion (32 bits), and we had to back that purchase out with Data Sources. I actually had to do it in several stages because I couldn't replicate the exact number, so I had to kill as much as possible and then kill the leftovers.
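For readers unfamiliar with this kind of underflow, a small sketch of how a slightly negative value turns into roughly 4 billion when stored as an unsigned 32-bit number, and of backing it out in stages. The post does not say why the exact number could not be replicated in one step; the per-row cap used below is purely an assumption for illustration.

```python
MASK32 = 0xFFFFFFFF  # all 32 bits set

def as_unsigned32(value: int) -> int:
    """Interpret an integer as an unsigned 32-bit value."""
    return value & MASK32

bad_value = as_unsigned32(-1)
print(hex(bad_value), bad_value)

def staged_backout(total: int, max_per_row: int) -> list[int]:
    """Split a back-out into negative rows no larger than max_per_row each."""
    rows = []
    remaining = total
    while remaining > 0:
        step = min(remaining, max_per_row)
        rows.append(-step)
        remaining -= step
    return rows

# Hypothetical cap of 2 billion per correction row.
rows = staged_backout(bad_value, 2_000_000_000)
print(rows, "net:", bad_value + sum(rows))
```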

 

But I learned from that incident that I REALLY didn't want to do a lot with Data Sources because that can really hurt your data.

 

That said, it is a good way to back out event data, especially currency data that is not correct (e.g. revenue comes in and it's supposed to be dollars but it's actually yen, and the currencyCode didn't get set to YEN to allow Adobe to translate it during processing).

 

For bad data in a prop or eVar, if you can isolate the bad data (its format etc.), you can use a processing rule to delete it (or move it to another variable) before it gets processed. For already processed data you can always classify the data and have 'good data' and 'bad data' classifications, or copy the key value (what came in via the s_code) to a classified value and ~empty~ the bad values (or set them to "Don't use this data" etc.).
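The classification approach can be sketched as a simple lookup from raw key values to a 'good data' / 'bad data' label. The key values, counts and the list of known-bad values below are made up; in practice the mapping would be maintained as a classification on the variable rather than in code.

```python
from collections import Counter

# Hypothetical raw key values for one eVar with their instance counts.
raw_counts = {
    "sku-100": 40,
    "sku-200": 25,
    "undefined": 300,
    "[object Object]": 120,
}

# Assumption: these are the values identified as coming from the bug.
KNOWN_BAD = {"undefined", "[object Object]"}

# The classification table: one label per raw key value.
classification = {
    key: ("bad data" if key in KNOWN_BAD else "good data") for key in raw_counts
}

# Reporting on the classification instead of the raw key.
by_class = Counter()
for key, count in raw_counts.items():
    by_class[classification[key]] += count

print(dict(by_class))
```

The raw values stay in the report suite untouched; users who report on the classification only see them grouped under the 'bad data' label.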

 

The big problem with deleting data is: do you delete the entire server call when evar1="bad data", or do you just delete evar1 and leave everything else? If you are thinking about just deleting that one eVar value, it might be better to classify it out. To delete the entire server call you would need Adobe to implement a 'delete this server call' type of process.

 

I would like this in processing rules anyway. Even if I have to pay for the server call, it would just be nice to be able to 'delete/suppress' everything in a server call based on some setting.

 

So I'm with you both ways. I wish we never had to do it, but I understand that sometimes you need to get rid of bad data. It's just a question of how you can go about doing it in a fast, efficient manner.

Level 2

10/17/16

Hi warrensander,

 

Thank you very much for your insights and best practices. Using processing rules and data sources seems to be a good way to "correct" bad data, so we'll look into that instead of pursuing a one-click-delete-button solution.

 

Creating additional virtual workspaces with enforced segments looks interesting; I'm not sure, however, how flexible that is in terms of adding new segments if needed.

 

We are of course setting calendar events for bad data, however this is only shown in charts and a lot of people work with Workspace data tables without any visualization. Maybe something for another Adobe suggestion to "fix" this...

 

Thanks again for your valuable insights!

Level 1

10/18/16

Hi warrensander,

 

You wrote: "To delete the entire server call you would need adobe to implement a 'delete this server call' type of a process." Is this something you have ever done? Or is this just something hypothetical?

 

Thanks,
Julian

Employee Advisor

10/19/16

So one idea would be to leverage a hit-based segment in order to remove a specific set of hits from a report suite, then apply that segment to a VRS. Obviously that's different from completely deleting data from reports, but as mentioned above, that can get hairy.

 

I tried this in a test report suite (I wanted to remove hits from 11am on October 6th) and it seemed to work well! It's obviously not perfect: since it removes the entire hit, any eVars/props that were set in that hit will also be removed. But it's another option to throw in here. I added a screenshot of my "exclude" segment if you're interested.

 

[Screenshot of the "exclude" segment: Screen Shot 2016-10-19 at 11.16.08 AM.png]
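The effect of such a hit-based exclude segment can be sketched as a filter over hits. This is only an illustration of the logic, not how a segment or VRS is defined in the product. The exclusion window is assumed to be the full hour starting at 11am on October 6th, 2016, and the hits and variable names are made up.

```python
from datetime import datetime

# Assumed exclusion window: the hour starting at 11am on October 6th, 2016.
EXCLUDE_START = datetime(2016, 10, 6, 11, 0)
EXCLUDE_END = datetime(2016, 10, 6, 12, 0)

def keep_hit(hit: dict) -> bool:
    """True if the hit falls outside the excluded hour."""
    return not (EXCLUDE_START <= hit["time"] < EXCLUDE_END)

# Hypothetical hits; each carries whatever variables were set on it.
hits = [
    {"time": datetime(2016, 10, 6, 10, 59), "eVar1": "a"},
    {"time": datetime(2016, 10, 6, 11, 30), "eVar1": "b", "prop5": "x"},
    {"time": datetime(2016, 10, 6, 12, 0), "eVar1": "c"},
]

# What the virtual report suite would show: whole hits are dropped,
# so eVar1="b" and prop5="x" disappear together with the 11:30 hit.
vrs_hits = [h for h in hits if keep_hit(h)]
print([h["eVar1"] for h in vrs_hits])
```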

Level 2

10/19/16

Hi Eric,

 

I am still inexperienced with the VRS, but I can imagine that setting this up, granting all relevant users access to the VRS, and then convincing (forcing?) everyone to use the new VRS instead of what they are used to can get both very time-consuming and error-prone.

 

Do you have any experience with that?

Level 6

10/19/16

Julian, no, I haven't put in an idea here to allow a processing rule to delete everything. I was sort of hoping that when Adobe finishes processing rules (so that I can get to s.products etc.), they would add a 'delete this server call' option on their own.

 

Then again, if they did it and didn't make me pay for the server call that would be even better. Or route the server call to a different report suite (ie move the bad data calls off to a 'bad data' report suite) to help with the debugging process.

 

I do think a suite of tools would be best: shift server calls to a different report suite via processing rules, delete the entire server call via a processing rule, and delete the entire server call without charging me for the call.

 

If I could delete the server call and not get charged, I would not have an issue with that flagging Adobe to contact me about why, and for how much, I am trying not to get charged, or with it being made a limited function. I had a change happen to my site in July that added 80M server calls/month, and when we decided to abort those calls it would have been nice to be able to kill them via a processing rule. We can't do that now, so we had to add a function to our code to kill some of these custom links that got added by accident until the app developers can fix and re-load their apps.