Hi All,
Has anyone had success applying retention to the AEP Data Lake? Not TTL from a dataset perspective, but at the Data Lake level itself. A successful POC for us would be the ability to delete an old batch.
Would the best way be to use the following?
https://developer.adobe.com/experience-platform-apis/references/catalog/#operation/listBatches
Hi David,
You are right. First, you have to list past batches. Order that list by timestamp descending, so that you start with the most recent batch and work back through the older ones. Then you can use the API to delete specific batches.
If you are looking to do this to free up storage, an Adobe Support Architect once told me that the storage count only covers what is in the profile store, not the data lake. So the only benefit I see is a shorter processing time when querying a dataset.
Best,
Renato
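For readers following along, the "list, order by timestamp descending, pick the old ones" step could be sketched roughly as below. This is only an illustration, not something posted in the thread: it assumes the list-batches response is a JSON object keyed by batch ID and that each entry carries a `created` timestamp in epoch milliseconds. The function name `select_expired_batches` and the `retention_days` parameter are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def select_expired_batches(batches, retention_days, now=None):
    """Return IDs of batches older than the retention window, newest first.

    `batches` is assumed to look like a Catalog list-batches response:
    {"<batch_id>": {"created": <epoch milliseconds>, ...}, ...}
    """
    now = now or datetime.now(timezone.utc)
    cutoff_ms = int((now - timedelta(days=retention_days)).timestamp() * 1000)

    # Keep only batches created strictly before the cutoff.
    expired = [
        (meta["created"], batch_id)
        for batch_id, meta in batches.items()
        if meta.get("created") is not None and meta["created"] < cutoff_ms
    ]

    # Timestamp descending: most recent expired batch first.
    expired.sort(reverse=True)
    return [batch_id for _, batch_id in expired]
```

The returned IDs are candidates only; reviewing them before any delete call is the safer path, since removing a batch from the lake is not something to retry blindly.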
Hi @renatoz28
Thank you for the response. Do you have documentation regarding storage in the profile store vs datalake? I wasn't aware of that.
Hey @brekrut do you happen to know of any documentation on this topic?
Hi @DavidRoss91
If you are looking to delete a specific batch from the datalake, yes, you are correct: you would use the Batch API to query the batches in question and remove them from the AEP datalake.
https://developer.adobe.com/experience-platform-apis/references/catalog/#tag/Batches
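As a rough sketch of what those two calls might look like, the snippet below only builds the HTTP requests and does not send them. Several things here are assumptions to verify against the linked reference: the `platform.adobe.io` host, the `dataSet`, `createdBefore`, `orderBy` and `limit` query parameters, and the use of the Batch Ingestion endpoint with `action=REVERT` to delete a batch. The helper names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

PLATFORM = "https://platform.adobe.io"  # assumed API gateway host


def build_headers(token, api_key, org_id, sandbox):
    """Standard Platform API headers (all values are placeholders)."""
    return {
        "Authorization": f"Bearer {token}",
        "x-api-key": api_key,
        "x-gr-org-id": org_id,
        "x-sandbox-name": sandbox,
    }


def list_batches_request(headers, dataset_id, created_before_ms, limit=100):
    """Build a GET for batches of one dataset created before a cutoff."""
    query = urlencode({
        "dataSet": dataset_id,               # assumed filter name
        "createdBefore": created_before_ms,  # assumed: epoch milliseconds
        "orderBy": "desc:created",           # assumed: newest first
        "limit": limit,
    })
    url = f"{PLATFORM}/data/foundation/catalog/batches?{query}"
    return Request(url, headers=headers, method="GET")


def delete_batch_request(headers, batch_id):
    """Build a POST that asks Platform to revert (delete) one batch."""
    # Assumed endpoint: Batch Ingestion API with action=REVERT.
    url = f"{PLATFORM}/data/foundation/import/batches/{batch_id}?action=REVERT"
    return Request(url, headers=headers, method="POST")
```

Either request could then be sent with `urllib.request.urlopen(req)`. Trying this against a non-production sandbox first seems sensible, since a deleted batch cannot simply be restored.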
The data in the datalake is available from a metric point of view, as referenced here.
@DavidRoss91 What is the use case you are attempting to solve with data retention in the datalake?
Thanks @brekrut - my understanding of the use case is to potentially highlight a retention policy within the AEP data lake. In the same way TTLs work for datasets, I am trying to understand whether something similar can be done from the data lake perspective as well.
Hi @DavidRoss91
The TTL policy is for the profile store. The datalake holds datasets, both those that are enabled for profile and those that are not. The amount of data retained in the Adobe datalake is shown in the DataLake storage license metric.
@brekrut got it, makes sense. So the only way to manage the data in the lake is essentially to delete by batch, as mentioned above?
Correct, the datalake retains the data that is ingested into the platform. It then comes down to how much data is stored on disk.