SOLVED

Multiple Files with Same File Format but Different File Names – SFTP Ingestion into AEP

Level 1

Hi everyone,

I have a scenario where 10 files are available in an SFTP location, and I’d like to load all of them into Adobe Experience Platform (AEP) at once. However, these files don’t all have the same number of fields — some contain additional or fewer fields than others.

My questions are:

  • Can all 10 files be ingested into AEP in a single load?

  • How does AEP handle files with different field counts or mismatched schemas?

  • Is there a best practice or recommended approach for handling such variations during ingestion?

 

 

Thank You

1 Accepted Solution

Correct answer by
Level 6

@SiddarthK there are two ways to handle this. In either case, first create a master schema that includes every field found across the 10 files.

1) Single Dataflow:

  • Standardize the files so they all have the same columns: add any column a file is missing and leave its values blank.
  • Create one standard data mapping from the source file fields to the schema fields.
  • In the dataflow, select the folder that contains all 10 files rather than an individual file. You might not get a preview when a folder is selected, but that is fine.

2) Multiple Dataflows:

  • Create 10 separate dataflows, each selecting one file and mapping only the fields available in that file.
  • With this approach, you don't have to standardize the files or add the missing fields with blank values.
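
For the single-dataflow option, the standardization step could be scripted before the files reach the SFTP folder. The following is only a minimal sketch, assuming the files are CSV with a header row (the thread does not state the file format); the file names, sample contents, and the helper names `master_header` and `standardize` are hypothetical and are not part of AEP.

```python
import csv
import io


def master_header(files):
    """Return the union of column names across all files, in first-seen order."""
    header = []
    for text in files.values():
        reader = csv.reader(io.StringIO(text))
        for name in next(reader, []):
            if name not in header:
                header.append(name)
    return header


def standardize(text, header):
    """Rewrite one CSV so it has every master column; missing values stay blank."""
    rows = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    # restval="" fills columns that this file does not have.
    writer = csv.DictWriter(out, fieldnames=header, restval="", lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return out.getvalue()


# Hypothetical sample data standing in for two of the ten SFTP files.
files = {
    "customers_a.csv": "id,email\n1,a@example.com\n",
    "customers_b.csv": "id,phone,country\n2,555-0100,US\n",
}
header = master_header(files)
standardized = {name: standardize(text, header) for name, text in files.items()}
```

After this step every file shares the same header, so one mapping to the master schema covers all of them. In practice the strings would be read from and written back to real files rather than held in a dictionary.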

5 Replies

Level 6

Hi @SiddarthK, if the answer is helpful, could you mark it as "Correct Answer"? It would help other community members.

Level 1

Hi @Devyendar, if I select the folder, the "Next" option is disabled.
I don’t think using multiple dataflows is very practical from a business perspective.

Level 6

@SiddarthK as I suggested earlier, you can consider either the single or the multiple dataflow approach.

A single dataflow requires you to standardize the files so that they all have the same columns/fields, even if some of them are empty. Getting the file creator to add those empty fields might be the harder part, but if that is possible, take this approach; it works.

Multiple dataflows are the least disruptive option: you take the existing files as they are and build each dataflow into one standard schema and dataset (or into multiple datasets, depending on whether you want all files in one). The multiple dataflows are internal to AEP, so there should not be any business impact or considerations.

 

You can choose what works best for your requirements.

Level 3

Generally speaking, it's better to ensure that all of your files have the same format, and only source a dataset from one folder in your cloud storage location.

Different datasets should be sourced from different folders, and each folder should be homogeneous in terms of its contents.

Then, whenever you later decide to add attributes to a dataset (schema), create a new folder and start sourcing your files from there.
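
One way to check the "homogeneous folder" rule before pointing a dataflow at a folder is to group the files by their header row. This is a rough sketch under the assumption that the files are CSV with a header row; the helper names `group_by_header` and `is_homogeneous` and the sample file names are made up for illustration and are not part of AEP.

```python
import csv
import io
from collections import defaultdict


def group_by_header(files):
    """Map each distinct header (as a tuple of column names) to the files using it."""
    groups = defaultdict(list)
    for name, text in files.items():
        header = tuple(next(csv.reader(io.StringIO(text)), []))
        groups[header].append(name)
    return dict(groups)


def is_homogeneous(files):
    """A folder is homogeneous when all of its files share one header."""
    return len(group_by_header(files)) <= 1


# Hypothetical folder contents: file name -> file text.
folder = {
    "orders_1.csv": "id,amount\n1,10\n",
    "orders_2.csv": "id,amount\n2,20\n",
    "orders_3.csv": "id,amount,currency\n3,30,USD\n",
}
groups = group_by_header(folder)
```

Each group in the result is a candidate for its own folder. In this sample, `orders_3.csv` would move to a new folder because it carries an additional column.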