
SOLVED

Query on Splitting Large Data Exports into Batches


Level 2

Hi All,

I have a question regarding a requirement. I have implemented an ETL workflow that exports data in CSV format to our downstream applications.

My requirement is to split the exported data into multiple batch files. For instance, we sometimes target more than 10 million individuals in an hour. When the broadLogData is exported, all 10 million responses end up in a single file, which takes a significant amount of time and produces a very large file. What I would like instead is that whenever the export exceeds 2.5 million records, it is split into batch files of 2.5 million records each; for 10 million records, that would be four files.
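To make the requirement concrete, this is roughly the kind of splitting I have in mind, written as a plain Python sketch over the exported CSV purely for illustration (the paths and file prefix are placeholders); ideally I would do the equivalent inside the workflow or with an out-of-the-box option:

```python
import csv

BATCH_SIZE = 2_500_000   # maximum number of data rows per batch file

def split_csv(source_path, output_prefix):
    """Split one large CSV export into files of at most BATCH_SIZE rows each."""
    with open(source_path, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader)          # repeat the header row in every batch file
        out, writer, batch_index, row_count = None, None, 0, 0
        for row in reader:
            if writer is None or row_count == BATCH_SIZE:
                if out:
                    out.close()
                batch_index += 1
                out = open(f"{output_prefix}_{batch_index}.csv", "w",
                           newline="", encoding="utf-8")
                writer = csv.writer(out)
                writer.writerow(header)
                row_count = 0
            writer.writerow(row)
            row_count += 1
        if out:
            out.close()
```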

Could you help me understand how to achieve this, either through code or with any out-of-the-box options that are available?

Thank you.



4 Replies


Community Advisor

Hi @Prady12,

Use a Split activity and create multiple subsets, and in each subset limit the record count to 2.5M (a rough sketch of the equivalent chunking arithmetic follows the screenshot).

[Screenshot: Split activity configuration with multiple subsets]
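For reference, the arithmetic behind the fixed subsets is just this (an illustrative Python sketch using the 10M / 2.5M figures from the question, not Campaign configuration):

```python
import math

TOTAL_RECORDS = 10_000_000   # example volume from the question
SUBSET_SIZE = 2_500_000      # cap per subset

# Number of subsets to configure in the Split activity when the total is known.
subset_count = math.ceil(TOTAL_RECORDS / SUBSET_SIZE)   # -> 4

# Record ranges each subset covers (0-based, end exclusive).
ranges = [(i * SUBSET_SIZE, min((i + 1) * SUBSET_SIZE, TOTAL_RECORDS))
          for i in range(subset_count)]
```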


Level 2

Thank you for the quick response. Yes, if the target data size is known and fixed, we can implement this OOB solution; we can use an exact record count or go with a percentage of the size as well.


Correct answer by
Community Advisor

Hello @Prady12, you can do something like this:

 

[Screenshot: workflow with Query, Test, Split, and Wait activities]

1. Query all the records.

2. Use a Test activity to check whether the count is greater than 2.5M. If not, export everything in a single batch.

3. Otherwise, use a Split activity to take out 2.5M records for export and create a loop (a rough sketch of this loop follows the list).

4. Adjust the wait time based on how long it takes to export the first file of 2.5M records. For example, if the export takes 15 minutes, add a wait of 25 or 30 minutes.
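Outside Campaign, steps 2-4 boil down to something like the minimal Python sketch below. It is only an illustration: `records` and `export_to_csv` are hypothetical placeholders for the query result and the export step (not Campaign APIs), and the 30-minute pause is just the example padding from step 4.

```python
import time

BATCH_SIZE = 2_500_000      # 2.5M records per exported file (step 3)
WAIT_SECONDS = 30 * 60      # padding between exports, tuned as in step 4

def export_in_batches(records, export_to_csv):
    """Export `records` in chunks of BATCH_SIZE, pausing between chunks."""
    batch, index = [], 1
    for row in records:
        batch.append(row)
        if len(batch) == BATCH_SIZE:
            export_to_csv(batch, index)   # split out 2.5M and export (step 3)
            batch, index = [], index + 1
            time.sleep(WAIT_SECONDS)      # wait before the next iteration (step 4)
    if batch:                             # remainder, or the single-batch case (step 2)
        export_to_csv(batch, index)
```

In the workflow itself, the Test, Split, and Wait activities shown in the screenshots play these same roles.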

 

Configuration of the Test activity:

[Screenshot: Test activity configuration]

Configuration of the Split activity:

[Screenshot: Split activity configuration]

Note: Make sure to add seconds to the file name; otherwise each export will overwrite the previous file.
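If the file name were built in a script, the idea would look something like this (a rough sketch; the prefix is only a placeholder):

```python
from datetime import datetime

# Including seconds in the timestamp keeps every batch file name unique,
# so a later batch cannot overwrite an earlier one.
file_name = "broadLogExport_" + datetime.now().strftime("%Y%m%d_%H%M%S") + ".csv"
```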

 

 

 

 


     Manoj
     Find me on LinkedIn


Level 2

Thank you for the quick response. Yes, this solution is close to my requirement; I will try to implement it.