Hello Community,
I am working with a dataset in AEP and need to trace all the batch IDs that were generated during a particular ingestion. I can see that a source Dataflow Run ID ingested some records into the dataset and ran for about 1.3 hours, but multiple batches were created in the dataset during that time.
I tried the APIs in the following ways:
My requirement is:
Given a source Dataflow Run ID, I want to list all the batch IDs created in AEP.
What is the correct way (via API or Query Service) to fetch all batch IDs of a dataset based on a given Source Dataflow Run ID?
Does your dataset receive data from multiple dataflows, and you’re trying to identify batchIds for each dataflow? As far as I know, this isn’t directly possible.
If you only need to see the different batchIds loading data into a dataset, you can query them through Query Service using _acp_system_metadata.acp_sourceBatchId in your SELECT statement.
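To make the Query Service suggestion concrete, here is a minimal sketch in Python that builds such a SELECT statement. The table name `my_dataset` is a placeholder for your dataset's Query Service table name, and the row count per batch is my own addition for convenience; only the `_acp_system_metadata.acp_sourceBatchId` column comes from the suggestion above.

```python
import re


def build_batch_id_query(table_name: str) -> str:
    """Build a SQL statement listing each source batch ID with its row count."""
    # Accept only a plain identifier so the table name cannot inject SQL.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table_name):
        raise ValueError(f"unexpected table name: {table_name!r}")
    return (
        "SELECT _acp_system_metadata.acp_sourceBatchId AS batch_id, "
        "COUNT(*) AS row_count "
        f"FROM {table_name} "
        "GROUP BY _acp_system_metadata.acp_sourceBatchId"
    )


# "my_dataset" is a hypothetical table name; substitute your own.
print(build_batch_id_query("my_dataset"))
```

You can paste the printed statement into the Query Editor or run it through any PostgreSQL-compatible client connected to Query Service.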
Alternatively, you can use the Catalog Service API for batches: https://platform.adobe.io/data/foundation/catalog/batches.
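A rough sketch of how a call to that endpoint could be prepared in Python follows. The query parameters (`dataSet`, `createdAfter`, `createdBefore`, `limit`) and the header set are assumptions based on my reading of the Catalog Service API, so please verify them against the API reference. All argument values are placeholders. The idea is to narrow the batch list to one dataset and to the time window of the dataflow run.

```python
from urllib.parse import urlencode
from urllib.request import Request

CATALOG_BATCHES_URL = "https://platform.adobe.io/data/foundation/catalog/batches"


def build_batches_request(dataset_id, created_after_ms, created_before_ms,
                          access_token, api_key, org_id, sandbox):
    """Prepare (but do not send) a GET request listing batches for one dataset."""
    # Assumed filters: dataset ID plus a creation window in epoch milliseconds.
    params = {
        "dataSet": dataset_id,
        "createdAfter": created_after_ms,
        "createdBefore": created_before_ms,
        "limit": 100,
    }
    # Assumed standard Platform API headers; values come from your own project.
    headers = {
        "Authorization": f"Bearer {access_token}",
        "x-api-key": api_key,
        "x-gw-ims-org-id": org_id,
        "x-sandbox-name": sandbox,
    }
    url = f"{CATALOG_BATCHES_URL}?{urlencode(params)}"
    return Request(url, headers=headers, method="GET")


# Placeholder values only; the window here spans 1.3 hours (4,680,000 ms).
req = build_batches_request("abc123", 1700000000000, 1700004680000,
                            "TOKEN", "API_KEY", "ORG_ID", "prod")
print(req.full_url)
```

To send it, pass `req` to `urllib.request.urlopen` and parse the body with `json.load`. This only narrows batches by dataset and time; it does not by itself prove which dataflow run produced each batch.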
If you specifically want to tie batches to individual dataflows, you’ll need a custom approach:
Update your schema to add a dataflowName field.
In each dataflow's mapping, set dataflowName to a value unique to that dataflow.
Then, using Query Service, select both the dataflowName field and _acp_system_metadata.acp_sourceBatchId to see which batches came from which dataflow for a dataset.
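The last step above could look roughly like the sketch below, which builds the query in Python. The table name `my_dataset`, the field path `_mytenant.dataflowName`, and the value `crm_daily` are all hypothetical; use whatever path the custom field has in your schema.

```python
import re


def build_dataflow_batch_query(table_name: str, dataflow_field: str,
                               dataflow_name: str) -> str:
    """Build SQL pairing a custom dataflow-name field with source batch IDs."""
    # Allow only identifiers (with dots for nested fields) in table and field names.
    for ident in (table_name, dataflow_field):
        if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_.]*", ident):
            raise ValueError(f"unexpected identifier: {ident!r}")
    # Double any single quotes so the value stays inside the SQL string literal.
    safe_name = dataflow_name.replace("'", "''")
    return (
        f"SELECT DISTINCT {dataflow_field} AS dataflow_name, "
        "_acp_system_metadata.acp_sourceBatchId AS batch_id "
        f"FROM {table_name} "
        f"WHERE {dataflow_field} = '{safe_name}'"
    )


# All three arguments are placeholders for illustration.
print(build_dataflow_batch_query("my_dataset", "_mytenant.dataflowName", "crm_daily"))
```

Dropping the WHERE clause would instead return every dataflow name and batch ID pair in the dataset.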
We are using streaming connections, and the connection takes around 1.3 hours to run. During this duration, 2–3 batches get created.
I tried the _acp_system_metadata.acp_sourceBatchId approach, but I am not sure what to put in the WHERE clause to filter the data by Dataflow Run ID. Also, could you let me know which system dataset contains this data?
It is also difficult to trace back the batch IDs from the past 2–3 days if all you have is the Dataflow Run ID.
If you open your dataset from the Datasets page, at the bottom you'll see the Dataflow Run IDs and their corresponding Batch IDs. For batch source connectors, these are 1:1.