Are there any dataset size limitations, and is there any impact of huge data size?

Solved

  • May 9, 2023
  • 2 replies
  • 1421 views

We are working with a single schema across multiple digital properties and have a single dataset created for these properties. We wanted to know whether keeping all the data in a single dataset can lead to any concerns.

1. Are there any dataset size limitations, per dataset or across all datasets combined?

2. Does a huge data volume in these datasets have any impact on other components, such as querying the data, sending data to data streams, or the AEP CDP, or raise any performance concerns?

Best answer by arijitg (see the accepted solution below).

2 replies

arpan-garg
Community Advisor
May 9, 2023

Hi @parm2 - Here you can find the default guardrails for Real-Time Customer Profile data:

https://experienceleague.adobe.com/docs/experience-platform/profile/guardrails.html?lang=de

 

I hope this helps. 

arijitg
Adobe Employee
Accepted solution
May 10, 2023

@parm2 There are certain profile guardrails that @arpan-garg shared above, and there are certain ingestion guardrails (Guardrails for Data Ingestion | Adobe Experience Platform) that you can check, but there is no hard guardrail on dataset size as of now.
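As a rough illustration of working within the ingestion guardrails, a producer can pack records into batches under a byte budget before sending them. This is a minimal sketch, not an Adobe API: the 256 MB ceiling below is a placeholder, so check the linked guardrails page for the actual per-batch limits.

```python
import json

# Placeholder ceiling for illustration only -- consult the published
# "Guardrails for Data Ingestion" page for the real per-batch limits.
MAX_BATCH_BYTES = 256 * 1024 * 1024

def split_into_batches(records, max_bytes=MAX_BATCH_BYTES):
    """Greedily pack JSON-serializable records into batches that each
    stay under the given byte budget."""
    batches, current, current_size = [], [], 0
    for record in records:
        size = len(json.dumps(record).encode("utf-8"))
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(record)
        current_size += size
    if current:
        batches.append(current)
    return batches

# Example: 10,000 small events easily fit into a single batch here.
events = [{"eventType": "web.webpagedetails.pageViews", "n": i} for i in range(10_000)]
print(len(split_into_batches(events)))
```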

Though we are not aware of your data or use case, the best practice is to split datasets based on source (rather than dumping all data into a single dataset). This helps with querying, analysis, and troubleshooting, and it also comes in handy if you ever need to clean up or reload a dataset.
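As an aside on why per-source datasets help at query time: Query Service exposes a PostgreSQL-compatible interface, so each property's dataset can be queried on its own. The sketch below is illustrative only; the host, credentials, and the dataset name web_events_site_a are placeholder assumptions, with the real connection values coming from the Queries > Credentials screen in Platform.

```python
import psycopg2

# All connection values below are placeholders -- copy the real ones
# from Queries > Credentials in the Platform UI.
conn = psycopg2.connect(
    host="example.platform-query.adobe.io",  # hypothetical host
    port=80,
    dbname="prod:all",
    user="EXAMPLE_ORG_ID",
    password="EXAMPLE_ACCESS_TOKEN",
    sslmode="require",
)

# Because each source has its own dataset, the scan is naturally scoped
# to one digital property instead of the combined firehose.
with conn.cursor() as cur:
    cur.execute(
        "SELECT _id, timestamp "
        "FROM web_events_site_a "  # hypothetical per-source dataset
        "WHERE timestamp > now() - interval '7 days' "
        "LIMIT 10"
    )
    for row in cur.fetchall():
        print(row)

conn.close()
```

Scoping a query to one per-source dataset like this also means a cleanup or reload of one property's data never touches the others.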