List all batch ids of a dataset based on Source Data flow run id | Community
Level 3
September 10, 2025
Question

List all batch ids of a dataset based on Source Data flow run id


Hello Community,

I am working with a dataset in AEP and need to trace all the batch IDs that were generated during a particular ingestion. I can see that the source Dataflow Run ID ingested some records into the dataset and that the run took 1.3 hours, but multiple batches were created in the dataset during this time.

I tried the APIs in the following ways:

  1. Flow Service, by run ID: the payload does not provide the dataset details.
  2. Catalog Service batches, by batch ID: here as well, the dataflow run details are not available.

My requirement is:

  • Given a source Dataflow Run ID, I want to list all the batch IDs created in AEP.

What is the correct way (via API or Query Service) to fetch all batch IDs of a dataset based on a given Source Dataflow Run ID?

 

2 replies

Devyendar
Level 6
September 10, 2025

Hi @mustufam5967803 

 

Does your dataset receive data from multiple dataflows, and you’re trying to identify batchIds for each dataflow? As far as I know, this isn’t directly possible.

If you only need to see the different batchIds loading data into a dataset, you can query them through Query Service using _acp_system_metadata.acp_sourceBatchId in your SELECT statement.
Alternatively, you can use the Catalog Service API for batches: https://platform.adobe.io/data/foundation/catalog/batches.
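As a rough sketch of those two options, the helpers below only build the query text and the request URL; they do not call AEP. The table name, dataset ID, and the `dataSet` / `limit` query parameters are assumptions for illustration, so check them against your own sandbox and the Catalog API reference before relying on them. A real request would also need the usual authentication headers, which are omitted here.

```python
import re
from urllib.parse import urlencode

# Endpoint quoted in the reply above.
CATALOG_BATCHES_URL = "https://platform.adobe.io/data/foundation/catalog/batches"


def build_batch_id_query(table_name: str) -> str:
    """Return SQL that lists the distinct source batch IDs in a dataset table."""
    # table_name is the dataset's Query Service table name (hypothetical here).
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table_name):
        raise ValueError(f"unexpected table name: {table_name!r}")
    return (
        "SELECT DISTINCT _acp_system_metadata.acp_sourceBatchId AS batch_id "
        f"FROM {table_name}"
    )


def build_catalog_batches_url(dataset_id: str, limit: int = 50) -> str:
    """Return a Catalog batches URL filtered to one dataset."""
    # Assumption: `dataSet` filters batches by dataset ID and `limit` caps
    # the number of results returned.
    params = urlencode({"dataSet": dataset_id, "limit": limit})
    return f"{CATALOG_BATCHES_URL}?{params}"


if __name__ == "__main__":
    print(build_batch_id_query("my_dataset_table"))
    print(build_catalog_batches_url("MY_DATASET_ID"))
```

Neither result carries the dataflow run ID, which is why tying batches to a specific dataflow needs something extra.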

If you specifically want to tie batches to individual dataflows, you’ll need a custom approach:

  • Update your schema to capture the dataflowName.

  • In each dataflow mapping, pass a unique value for dataflowName.

  • Then, using Query Service, select both the dataflowName field and _acp_system_metadata.acp_sourceBatchId to see which batches came from which dataflow for a dataset.
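The last step could be sketched as follows. This is only an illustration: the table name and the path of the custom `dataflowName` field (for example under a tenant namespace) are hypothetical and depend on how the schema was extended. The helper just produces the SQL text to run in Query Service.

```python
def build_dataflow_batch_query(table_name: str, dataflow_field: str) -> str:
    """Return SQL that pairs each dataflow name with the batches it loaded.

    table_name     -- Query Service table for the dataset (hypothetical)
    dataflow_field -- path of the custom dataflowName field (hypothetical)
    """
    batch_field = "_acp_system_metadata.acp_sourceBatchId"
    return (
        f"SELECT {dataflow_field} AS dataflow_name, "
        f"{batch_field} AS batch_id, "
        "COUNT(*) AS row_count "
        f"FROM {table_name} "
        f"GROUP BY {dataflow_field}, {batch_field} "
        "ORDER BY 1, 2"
    )


if __name__ == "__main__":
    # Both arguments are placeholders, not real names.
    print(build_dataflow_batch_query("my_dataset_table", "_mytenant.dataflowName"))
```

Each output row then shows one dataflow name, one batch ID, and how many records that batch contributed. Note that this identifies the dataflow, not an individual dataflow run.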

 

Level 3
September 11, 2025

We are using streaming connections, and the connection takes around 1.3 hours to run. During this duration, 2–3 batches get created.

I tried the _acp_system_metadata.acp_sourceBatchId approach, but I am not sure what to add to the WHERE clause to filter the data by Dataflow Run ID. Also, could you let me know which system dataset contains this data?

It also gets difficult to trace back the batch ID of the past 2–3 days if you only have the Data Flow run ID.

Sukrity_Wadhwa
Community Manager
October 17, 2025

Hi @devyendar,

Can you please help @mustufam5967803 further with their query?

Thanks!

Sukrity Wadhwa
Adobe Employee
September 11, 2025

If you look at the bottom of the Datasets page for your dataset, you'll see Dataset Run IDs and their corresponding Batch IDs. For batch source connectors these are 1:1.