SOLVED

AEP Data Lake Retention

Community Advisor

Hi All,

Has anyone had success applying retention to the AEP Data Lake? I don't mean TTL from a dataset perspective, but retention at the data lake level itself. A successful POC for us would be the ability to delete an old batch.

Would the best way be to use the following?

https://developer.adobe.com/experience-platform-apis/references/catalog/#operation/listBatches

1 Accepted Solution

Correct answer by
Level 2

Hi David,

 

You are right. First, you have to list the past batches. You would need to order that list by timestamp descending, so that the most recent batch comes first and you can work back through the older ones. Then you could use the API to delete specific batches.
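The selection step above can be sketched roughly as follows. This is a minimal sketch, not an Adobe-provided method: it assumes the batch listing has already been fetched and parsed into a dict keyed by batch ID, with a `created` field in epoch milliseconds (an assumption about the response shape, so check it against the Catalog reference). `select_expired_batches`, `purge`, and the `delete_batch` callback are hypothetical names; the actual delete call is left to whatever endpoint the API reference documents.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone
from typing import Callable, Iterable


def select_expired_batches(batches: dict[str, dict], cutoff: datetime) -> list[str]:
    """Return the IDs of batches created before `cutoff`, newest first.

    Assumes each entry carries a `created` timestamp in epoch milliseconds.
    """
    cutoff_ms = int(cutoff.timestamp() * 1000)
    # Order by creation timestamp, descending, as suggested in the reply.
    ordered = sorted(batches.items(), key=lambda kv: kv[1]["created"], reverse=True)
    return [batch_id for batch_id, meta in ordered if meta["created"] < cutoff_ms]


def purge(
    batch_ids: Iterable[str],
    delete_batch: Callable[[str], None],
    dry_run: bool = True,
) -> list[str]:
    """Invoke `delete_batch` (hypothetical callback) for each ID unless dry_run."""
    processed = []
    for batch_id in batch_ids:
        if not dry_run:
            delete_batch(batch_id)
        processed.append(batch_id)
    return processed


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Made-up sample listing, only to illustrate the assumed shape.
    listing = {
        "batch-old": {"created": int((now - timedelta(days=400)).timestamp() * 1000)},
        "batch-new": {"created": int((now - timedelta(days=10)).timestamp() * 1000)},
    }
    expired = select_expired_batches(listing, now - timedelta(days=365))
    print(purge(expired, delete_batch=print, dry_run=True))
```

Keeping `dry_run=True` as the default lets you review the candidate list for a POC before anything is actually removed.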

 

If you are looking to do this to clean up data storage, an Adobe Support Architect once told me that storage is only counted for what is in the profile store and not the data lake. If that is the case, the only benefit I can see is reduced processing time when querying a dataset.

 

Best,

Renato 

8 Replies

Community Advisor

Hi @renatoz28 

Thank you for the response. Do you have documentation regarding storage in the profile store vs datalake? I wasn't aware of that. 

Community Advisor

Hey @brekrut do you happen to know of any documentation on this topic?

Employee

Hi @DavidRoss91 

 

If you are looking to delete a specific batch from the datalake, then yes, you are correct: you would use the Batch API to query the batches in question and remove them from the AEP datalake.

 

https://developer.adobe.com/experience-platform-apis/references/catalog/#tag/Batches
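As a rough illustration of the "query the batches" step, here is a sketch that only builds the list request and does not send it. The base URL, the `dataSet`, `orderBy` and `limit` query parameters, and the header names reflect my understanding of the Catalog API and should be treated as assumptions to verify against the reference linked above. `build_list_batches_request` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumed Catalog batches endpoint; confirm against the API reference.
CATALOG_BATCHES_URL = "https://platform.adobe.io/data/foundation/catalog/batches"


def build_list_batches_request(
    access_token: str,
    api_key: str,
    org_id: str,
    sandbox: str,
    dataset_id: str,
    limit: int = 100,
) -> tuple[str, dict[str, str]]:
    """Return (url, headers) for listing a dataset's batches, newest first."""
    params = {
        "dataSet": dataset_id,       # assumed filter parameter
        "orderBy": "desc:created",   # assumed ordering syntax
        "limit": limit,
    }
    headers = {
        "Authorization": f"Bearer {access_token}",
        "x-api-key": api_key,
        "x-gr-ims-org-id": org_id,
        "x-sandbox-name": sandbox,
    }
    return f"{CATALOG_BATCHES_URL}?{urlencode(params)}", headers
```

The returned URL and headers can be handed to any HTTP client; keeping the request construction separate makes it easy to inspect before calling the live API.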

 

From a metrics point of view, the data stored in the datalake is reported as referenced here:

https://experienceleague.adobe.com/en/docs/experience-platform/dashboards/guides/license-usage#avail...

 

@DavidRoss91  What is the use case you are attempting to solve with data retention in the datalake?

Community Advisor

Thanks @brekrut. My understanding of the use case is to potentially highlight a retention policy within the AEP data lake. In the same way that TTLs work for datasets, I am trying to understand whether something similar can be done from the data lake perspective as well.

Employee

Hi @DavidRoss91 

 

The TTL policy is for the profile store. The datalake holds datasets whether or not they are enabled for profile. The amount of data retained in the Adobe datalake is shown in the license metric for DataLake storage.

 

https://experienceleague.adobe.com/en/docs/experience-platform/dashboards/guides/license-usage#avail...

Community Advisor

@brekrut Got it, that makes sense. So the only way to manage the data in the lake is essentially to delete it batch by batch, as mentioned above?

Employee

@DavidRoss91 

 

Correct, the datalake will retain the data that is ingested into the platform. It then comes down to how much data is stored on disk.