Hello,
I’m currently working on importing a large number of lead records into Adobe Experience Platform (AEP). Our AEP instance already holds 30 million profiles, and I need to import an additional 10 million leads. Not all of these records are new: some will likely merge with existing profiles, while others will create new ones.
Here’s my current plan:
Pre-Processing:
- Ensure Data Format: Validate that every field conforms to the target AEP schema; for example, check that email addresses and phone numbers are well-formed.
- Normalize Data Fields: Format each field consistently; for example, standardize phone numbers to a single format such as "+1-555-555-5555".
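As a rough sketch of that pre-processing step, here is what the normalization could look like in Python. This assumes US phone numbers and a deliberately simple email check; the helper names are mine, not anything AEP provides:

```python
import re

def normalize_phone(raw):
    """Normalize a US phone number to the +1-NNN-NNN-NNNN format."""
    digits = re.sub(r"\D", "", raw)          # strip punctuation and spaces
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]                   # drop the leading country code
    if len(digits) != 10:
        return None                           # flag for review instead of guessing
    return f"+1-{digits[:3]}-{digits[3:6]}-{digits[6:]}"

# Intentionally loose: just "something@something.tld"; deeper validation
# (MX lookup, disposable-domain lists) would happen in a later pass.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def is_valid_email(addr):
    return bool(EMAIL_RE.match(addr.strip().lower()))
```

Returning `None` for malformed phone numbers (rather than dropping the record) lets you route those leads to a review queue before ingestion.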
Ingestion Plan:
- Split Files: Break the dataset into smaller files of around 5MB each.
- Batch Ingestion API: Start with 5 parallel requests and adjust based on performance.
I expect AEP to handle the following after ingestion:
- Remove any clearly fake or invalid leads.
- Normalize the data to align with existing profiles.
- Stitch related information into unique profiles (identity resolution).
- Merge profiles as detected during the import process.
- Provide both summarized and detailed reports of the process.
- For each:
  - New profile
  - Update to an existing profile
  - Merge of two or more profiles
I also want AEP to fire a webhook (Adobe I/O Event) for each of these outcomes. The events will be pushed onto a queue, and Salesforce will consume them as fast as it can process them.
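A minimal sketch of that webhook-to-queue decoupling, to validate the shape of the idea: the endpoint only enqueues, and a separate worker drains at Salesforce's pace. `push_to_salesforce` is a hypothetical call, and in production the buffer would be a durable queue service (SQS, Pub/Sub, Kafka) rather than an in-process `queue.Queue`:

```python
import json
import queue
import threading

event_queue = queue.Queue()  # buffer between AEP I/O Events and Salesforce

def handle_io_event(payload: str):
    """Webhook handler: acknowledge fast by only parsing and enqueueing."""
    event = json.loads(payload)
    event_queue.put(event)

def salesforce_worker(stop: threading.Event, processed: list):
    """Drain the queue as fast as Salesforce allows."""
    while not stop.is_set() or not event_queue.empty():
        try:
            event = event_queue.get(timeout=0.1)
        except queue.Empty:
            continue
        # push_to_salesforce(event)  # hypothetical API call, with retries
        processed.append(event["type"])
        event_queue.task_done()
```

The key property is back-pressure: a burst of merge events during the import just grows the queue instead of overwhelming Salesforce's API limits.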
I’m looking for tips, ideas, and validation of this strategy. Does my approach make sense, or are there better practices I should consider?
Thanks in advance!