SOLVED

Accessing data, data retention, and data cleanup

Community Advisor

Hello,

 

I have a few questions regarding accessing data through the Profile API, data retention, and data cleanup.

1. I have recently ingested data into AEP using the streaming API and I am currently accessing it through the Real-Time Profile API. I understand that the data is stored in the profile store for 7 days, but I am unsure if the Profile API will still provide a response after this period. Could you please clarify this for me?

2. Additionally, I am curious how long data remains in the data lake before Adobe archives it. Should I be concerned about backing up the data I have ingested into AEP, or does it remain there indefinitely? This document says that data stays for 7 days: https://experienceleague.adobe.com/docs/experience-cloud-kcs_en/kbarticles/KA-19958.html?lang=en

3. Finally, I would like to know how to delete event data in the data lake that is no longer relevant. Can you provide some guidance on this?

 

Thank you for your time and assistance,

Arpan

1 Accepted Solution

Correct answer by
Moderator

Hi @arpan-garg 

 

1. If you have not enabled a TTL window of 7 days, you should still get a response after that period. Note, however, that new sandboxes come with a default TTL period of 30 days, so data will be removed after that period.

2. The data lake is meant as a temporary data zone for getting data into Profile. It is intended to hold data for only 7 days, with the sole purpose of preparing the data for ingestion into Profile. However, there are no enforced guardrails around this limit. Data is neither deleted nor made inaccessible after 7 days.

3. You can delete data from the data lake at the batch level; deletion at the record level is not possible. List the old batches that are no longer required and use the batch deletion API. Below is a reference guide:

 

Batch Ingestion API Guide | Adobe Experience Platform
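
As a rough illustration of the "list old batches, then delete them" approach, here is a minimal Python sketch. It only selects batch IDs and builds the request description; it does not send anything. The endpoint root, the `action=REVERT` parameter, the header names, and the shape of the batch listing (batch ID mapped to metadata with a `created` value in epoch milliseconds) are assumptions that should be checked against the guide above. The function names and sample batches are hypothetical.

```python
from datetime import datetime, timezone

# Assumed endpoint root; confirm it in the Batch Ingestion API guide.
IMPORT_BASE = "https://platform.adobe.io/data/foundation/import"


def select_old_batches(batches, cutoff):
    """Return the IDs of batches created before `cutoff`.

    `batches` is assumed to map batch ID -> metadata containing a
    `created` timestamp in epoch milliseconds.
    """
    cutoff_ms = int(cutoff.timestamp() * 1000)
    return sorted(
        batch_id
        for batch_id, meta in batches.items()
        if meta["created"] < cutoff_ms
    )


def build_delete_request(batch_id, token, api_key, org_id, sandbox):
    """Describe the HTTP call that would delete one batch (nothing is sent)."""
    return {
        "method": "POST",
        # `action=REVERT` is an assumption about the batch deletion call.
        "url": f"{IMPORT_BASE}/batches/{batch_id}?action=REVERT",
        "headers": {
            "Authorization": f"Bearer {token}",
            "x-api-key": api_key,
            "x-gw-ims-org-id": org_id,
            "x-sandbox-name": sandbox,
        },
    }


# Illustrative data only: two made-up batches, one from 2022 and one from 2023.
batches = {
    "batch-a": {"created": 1640995200000},  # 2022-01-01 UTC
    "batch-b": {"created": 1680000000000},  # late March 2023 UTC
}
cutoff = datetime(2023, 1, 1, tzinfo=timezone.utc)

for batch_id in select_old_batches(batches, cutoff):
    request = build_delete_request(batch_id, "TOKEN", "API_KEY", "ORG_ID", "prod")
    print(request["method"], request["url"])
```

Keeping the selection step separate from the request step makes it easy to review the list of batch IDs before anything is actually deleted.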

14 Replies

Community Advisor

Hi @arijitg

 

 

 

1. Where can I see and set the TTL? Once the TTL period is reached, does it mean I can't access the profile via the Profile API? I was assuming that the Profile API would always return a response, irrespective of how old the data is.

2. Is there a way to delete data that is no longer relevant for a business use case? For example, deleting all events with a timestamp more than a year old.

 

 

Moderator

Hi @arpan-garg 

 

1. The TTL can be set by the product team if you raise a request with them; you can't set or access it directly. I think the default TTL window is mentioned when the license is provisioned; otherwise, you can raise a request to verify it. TTL only removes events from the Profile store, so if you have profile data you can still access it, but you won't see the associated event data once it has expired. Events are removed in a rolling manner.

 

2. For the data lake, no. TTL removes events from the Profile store but not from the data lake.
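
To picture the rolling removal, here is a minimal Python sketch of which events would still be visible in the Profile store on a given day. It is a toy model, not an Adobe API: the function name and the 30-day value are illustrative, and it assumes that expiration is measured from each event's own timestamp.

```python
from datetime import datetime, timedelta, timezone


def events_visible_in_profile(event_times, ttl_days, now):
    """Hypothetical model: keep only events no older than the TTL window."""
    cutoff = now - timedelta(days=ttl_days)
    return [t for t in event_times if t >= cutoff]


now = datetime(2023, 5, 9, tzinfo=timezone.utc)
# Three made-up events that happened 1, 20, and 45 days ago.
events = [now - timedelta(days=d) for d in (1, 20, 45)]

visible = events_visible_in_profile(events, ttl_days=30, now=now)
print(len(visible))  # 2: the 45-day-old event has aged out
```

Because the cutoff moves forward with `now`, each event drops out on its own schedule rather than all events disappearing at once.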

Community Advisor

Is there a way to delete data with scheduled queries? 

Moderator

No, @arpan-garg. You can schedule queries to overwrite profile attributes with null and so make them empty, but you can't delete them.

Community Advisor

Thanks @arijitg. We currently have a third-party app that consumes the Profile API to get profile and event details. Basically, it generates a 360 view of the profile and shows it in the app. From your response, it seems that event data related to the profile will not show after 2 months. How should we handle this?

Moderator

@arpan-garg If the default TTL window is set to 2 months, then this will happen. You can raise a support request with the product team to extend the window, but this will increase profile richness and may affect your license limit.

Community Advisor

Thanks for the info, @arijitg. I couldn't find any documents related to this. Do you know of any document that describes this limitation?

Community Advisor

Hello @arijitg  I have read the document and would like to confirm my understanding with you.

If a TTL (time-to-live) is applied to an Experience Event dataset, any events that are older than the TTL will be deleted and will not appear in the response of Profile APIs.

It's important to note that TTL can only be applied to an Experience Event dataset, and therefore, the Individual profile dataset will not be affected by this.

Can you please confirm whether my understanding is correct?

 

Also, I have one more question. TTL will delete the data not only from the Profile store but also from the dataset, right? So I can't access the data anymore after it's deleted?

 

Best,

Arpan

Moderator

Dear @arpan-garg, your understanding is absolutely correct. However, TTL deletes data only from the Profile store; it doesn't touch the data lake or the dataset, so you can still query the data.

Community Advisor

@arijitg - The document you shared says something different (maybe I misunderstood it):

After Experience Event expirations have been enabled on a Profile-enabled dataset, Platform automatically applies the expiration values for each captured event in a two-step process:

  1. All new data that is ingested into the dataset has the expiration value applied at ingestion time based on the event timestamp.
  2. All existing data in the dataset has the expiration value retroactively applied as a one-time backfill system job. Once the expiration value has been placed on the dataset, events that are older than the expiration value will be immediately dropped as soon as the system job runs. All other events will be dropped off as soon as they reach their expiration values from the event timestamp. When all Experience Events have been removed, if the profile no longer has any profile attributes, the profile will no longer exist.

 

To make it clearer, let's say I have a dataset 'A' that is enabled for Profile, with a TTL of 15 days on it. I assume that any event ingested today will stay in the dataset for 15 days and will be deleted after that.

My understanding is that if I query this dataset after 15 days, I will not get events that happened more than 15 days ago.

But according to you, I won't get the data in the Profile API response after 15 days, yet I can still see it in Query Service after 15 days.

 

Community Advisor

Also, as you can see here: https://experienceleaguecommunities.adobe.com/t5/adobe-experience-platform/archiving-old-profiles/m-... @adobechat mentioned that TTL will delete the data completely from AEP, so I am a bit confused.

 

[Screenshot attached]

 

Community Advisor

Hi @arijitg - Got the clarity now, thanks for the info