Look Up AEP Field Usage
Description:
When managing XDM schemas across multiple sandboxes, there is currently no built-in way to determine which assets reference a specific field before making changes. If a field was created with the wrong data type (e.g., number in one sandbox, string in another), the platform offers nothing to assess the blast radius of fixing it.
The Problem:
Today, if I need to deprecate or replace a schema field, I have to manually query multiple APIs to find references:
- Export all segment definitions and parse PQL expressions for the field path
- Query the Catalog API to find datasets tied to the schema
- Export computed attribute definitions and search those too
- Check dataflows via the Flow Service API
There is no unified view, no reverse index on field paths, and no UI support for this workflow. For organizations with hundreds of segments and dozens of schemas, this is a significant operational gap, especially when a schema inconsistency needs to be corrected urgently.
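To illustrate the first manual step, here is a minimal Python sketch of the kind of script this currently requires. It assumes the segment definitions have already been exported as a list of dicts with the PQL text under expression.value. The function names are my own, not part of any Adobe SDK. The match is a plain text search rather than a real PQL parser, so it deliberately errs toward over-reporting.

```python
import re
from typing import Any, Dict, List


def field_path_pattern(field_path: str) -> "re.Pattern[str]":
    # Match the path as a whole token: "person.birthDate" must not match
    # "person.birthDateTime" or "xperson.birthDate". A longer path that ends
    # in the same segments (e.g. "_tenant.person.birthDate") still matches,
    # which is acceptable when the goal is to avoid missing references.
    return re.compile(r"(?<!\w)" + re.escape(field_path) + r"(?!\w)")


def find_segment_references(
    segments: List[Dict[str, Any]], field_path: str
) -> List[Dict[str, Any]]:
    """Return id/name of every exported segment whose PQL text mentions field_path."""
    pattern = field_path_pattern(field_path)
    hits = []
    for seg in segments:
        # Assumed export shape: {"id": ..., "name": ..., "expression": {"value": "<PQL>"}}
        pql_text = (seg.get("expression") or {}).get("value") or ""
        if pattern.search(pql_text):
            hits.append({"id": seg.get("id"), "name": seg.get("name")})
    return hits
```

The same scan has to be repeated, with different export formats, for computed attributes, datasets, and dataflows, which is exactly the duplication a built-in lookup would remove.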
Proposed Solution:
Add a Field Usage Analysis feature accessible from the Schema Editor UI and via API:
- UI: When viewing a field in the Schema Editor, add a "View Usage" option that shows all assets referencing that field — segments, datasets, computed attributes, destinations, and dataflows
- API: A new endpoint (e.g., GET /schemaregistry/field-usage?fieldPath=person.birthDate) that returns a structured report of all references across asset types
- Pre-change validation: When a field is being deprecated or a schema is being modified, surface a warning with the list of impacted assets before the change is applied
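As a rough sketch of what the structured report and the pre-change warning could look like, the Python below models one possible shape. Everything in it is hypothetical: the class names, the fields, and the warning text are only meant to make the proposal concrete, and no such endpoint or response format exists today.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AssetReference:
    # Hypothetical record for one asset that references the field.
    asset_type: str  # e.g. "segment", "dataset", "computedAttribute", "dataflow"
    asset_id: str
    name: str
    sandbox: str


@dataclass
class FieldUsageReport:
    # Hypothetical payload for the proposed field-usage lookup.
    field_path: str
    references: List[AssetReference] = field(default_factory=list)

    def by_type(self) -> Dict[str, List[AssetReference]]:
        """Group references by asset type for display or summarizing."""
        grouped: Dict[str, List[AssetReference]] = {}
        for ref in self.references:
            grouped.setdefault(ref.asset_type, []).append(ref)
        return grouped


def pre_change_warning(report: FieldUsageReport) -> Optional[str]:
    """Return a warning summarizing impacted assets, or None if nothing references the field."""
    if not report.references:
        return None
    counts = ", ".join(
        f"{len(refs)} {asset_type}"
        for asset_type, refs in sorted(report.by_type().items())
    )
    return (
        f"Changing {report.field_path} affects "
        f"{len(report.references)} asset(s): {counts}"
    )
```

Grouping by asset type keeps the same report usable for both the "View Usage" panel and the warning shown before a field is deprecated.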
Use Case:
A schema field was created with type number in the dev sandbox but type string in production, clearly an error made during initial setup. Before creating a replacement field and migrating audiences, I need to know exactly which segments, computed attributes, and datasets reference this field so nothing breaks silently. Today this requires custom scripting against at least four separate APIs, plus manual PQL parsing.
Business Value:
- Reduces risk of breaking production audiences and data pipelines during schema maintenance
- Saves hours of manual API querying and scripting for what should be a basic platform capability
- Improves confidence in schema governance, especially for organizations managing multiple sandboxes
- Aligns with the data lineage capabilities already present in Data Governance, extended to field-level granularity