Hi @ReshmikaPu
Batches that sit in loading for days/weeks will never self‑recover – you must treat them as failed, and you need your own guardrail on top of the standard Flow/Alert rules to catch them early.
1. Understand what loading really means
For batch ingestion, loading means:
- Files were staged, but
- The batch was never promoted into the Data Lake (no ing_load_success or ing_load_failure event, status never moves to staging/success/failure).
In other words, a batch that sits in loading for hours or longer is effectively a failed batch that simply has not been flagged as failed yet. See the batch lifecycle and statuses in Batch Ingestion API Overview (https://experienceleague.adobe.com/en/docs/experience-platform/ingestion/batch/overview) and the monitoring guidance in Retrieving data ingestion error diagnostics (https://experienceleague.adobe.com/en/docs/experience-platform/ingestion/quality/error-diagnostics).
2. Use product alerts where they apply (Flow‑level)
For your Azure Blob Source dataflow you should still enable:
- Sources Flow Run Delay
- Sources Flow Run Failure
- Sources Ingestion Error Rate Exceeded
from Administration > Alerts as described in https://experienceleague.adobe.com/en/docs/experience-platform/observability/alerts/overview
These will catch most connector‑level failures (wrong folder, credentials, mapping, etc.), but they do not detect every case where the Catalog batch gets stuck in loading.
3. Add a small, custom "stuck loading" monitor (this is the key)
This is the sustainable, long‑term pattern that actually closes the gap:
Option A – Easiest: periodic Catalog scan
- On a schedule (e.g. every 10–15 minutes), call the Catalog Batches API, where {now-2h} is a placeholder for your cutoff timestamp (filter by datasetId or sandboxId if needed):
  GET /data/foundation/catalog/batches?property=status==loading&property=created<={now-2h}
- Treat any batch as "stuck" if it:
  - has status = "loading", and
  - was created longer ago than your SLA allows (e.g. 60–120 min).
- For each stuck batch:
  - send an internal alert (email/Slack/Teams, etc.)
  - log {imsOrgId, sandboxName, datasetId, batchId} for triage
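The scan above could be sketched roughly as follows. This is a minimal sketch under assumptions, not a verified implementation: it assumes the Catalog response is a JSON object keyed by batch ID, that each batch carries `status` and a `created` timestamp in epoch milliseconds, and that the standard Platform headers are enough to authenticate. The function names and the `sla_minutes` parameter are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

CATALOG_BATCHES_URL = "https://platform.adobe.io/data/foundation/catalog/batches"


def fetch_loading_batches(token, api_key, org_id, sandbox, cutoff_ms):
    """Ask Catalog for batches still in 'loading' that were created at or before cutoff_ms.

    Assumption: 'created' is compared as epoch milliseconds, mirroring the
    property filter shown in the request above.
    """
    query = urllib.parse.urlencode(
        [("property", "status==loading"), ("property", f"created<={cutoff_ms}")]
    )
    request = urllib.request.Request(
        f"{CATALOG_BATCHES_URL}?{query}",
        headers={
            "Authorization": f"Bearer {token}",
            "x-api-key": api_key,
            "x-gw-ims-org-id": org_id,
            "x-sandbox-name": sandbox,
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def find_stuck_batches(batches, now_ms, sla_minutes=120):
    """Return batches in 'loading' that are older than the SLA, oldest first.

    The check is repeated client-side so the result stays correct even if the
    server-side filter behaves differently than assumed.
    """
    cutoff_ms = now_ms - sla_minutes * 60_000
    stuck = []
    for batch_id, batch in batches.items():
        created = batch.get("created")
        if batch.get("status") == "loading" and created is not None and created <= cutoff_ms:
            stuck.append(
                {
                    "batchId": batch_id,
                    "created": created,
                    "ageMinutes": (now_ms - created) // 60_000,
                }
            )
    return sorted(stuck, key=lambda item: item["created"])
```

A scheduler (cron, Azure Function timer, etc.) would call `fetch_loading_batches`, pass the result to `find_stuck_batches`, and forward any non-empty result to your alerting channel.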
Option B – More advanced: use ingestion notification events
If you prefer an event‑driven approach:
- In Adobe Developer Console, subscribe your webhook to data ingestion notification events.
- In your webhook logic:
  - For each new ing_load_* event, store {batchId, datasetId, createdAt}.
  - If no ing_load_success or ing_load_failure arrives for that batchId within your SLA window (e.g. 60–120 minutes), mark it as stuck in loading and alert.
  - Optionally confirm with a Catalog call (GET /data/foundation/catalog/batches/{batchId}) before paging anyone.
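The webhook bookkeeping could look something like the sketch below. It is an illustration under assumptions: the class name, method names, and the idea that your webhook handler has already extracted an event code, batch ID, dataset ID, and a timestamp in seconds from the notification payload are all hypothetical. Only the `ing_load_success` and `ing_load_failure` codes come from the description above; any other event code is treated as "batch seen, outcome pending".

```python
from dataclasses import dataclass, field

# Event codes that close out a batch (taken from the steps above).
TERMINAL_EVENTS = {"ing_load_success", "ing_load_failure"}


@dataclass
class StuckBatchTracker:
    """In-memory tracker; a real deployment would persist this state."""

    sla_seconds: int = 2 * 3600
    pending: dict = field(default_factory=dict)   # batchId -> first-seen info
    finished: set = field(default_factory=set)    # batchIds with a terminal event

    def on_event(self, event_code, batch_id, dataset_id, timestamp):
        """Record one notification for a batch."""
        if event_code in TERMINAL_EVENTS:
            # Outcome known: stop watching this batch.
            self.pending.pop(batch_id, None)
            self.finished.add(batch_id)
        elif batch_id not in self.finished:
            # Keep the earliest sighting; ignore late events for finished batches.
            self.pending.setdefault(
                batch_id, {"datasetId": dataset_id, "firstSeen": timestamp}
            )

    def overdue(self, now):
        """Return batch IDs with no terminal event within the SLA window."""
        return sorted(
            batch_id
            for batch_id, info in self.pending.items()
            if now - info["firstSeen"] >= self.sla_seconds
        )
```

A periodic job would call `overdue(now)` and, for each returned batch ID, run the optional Catalog confirmation before alerting.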
4. Hard truth about the existing two‑month‑old loading batches
Those specific batches will not progress by themselves.
Given your consent‑data constraints, it's reasonable not to re‑ingest them now; instead, put the monitor in place so that future stuck batches are detected within minutes or hours, not months.