Ingest parquet file example | Community
DavidSlaw1
May 13, 2024
Solved



Does anyone have an example of ingesting a parquet file into AEP? Documentation and Adobe Support say the parquet file must exactly match the XDM schema in AEP. I have a test file created to do just that. However, there has to be a way to ingest a parquet file and use a mapping set. I prefer to ingest data from client systems without requiring the client to create a specific format just for Adobe.

Best answer by DavidSlaw1

There must be a way to ingest data from a parquet file that is not XDM compliant.  It seems silly to ask a client to reformat / restructure data just for AEP to consume it.


Solved. There is no way to ingest non-XDM-compliant parquet. I solved it by making the parquet XDM-compliant using a data pipeline, ensuring the datetime values are in a format AEP can ingest without errors, and sizing the data files to optimize load time. The parquet file loads 60% faster than the CSV.
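The post does not show the pipeline itself, so the following is only a minimal sketch of what the reshaping step might look like. It assumes flat source rows, a hypothetical `FIELD_MAP` from client column names to dotted XDM field paths (the paths shown, including `_tenant.signupDate`, are made up for illustration), source datetimes in `YYYY-MM-DD HH:MM:SS` form that are already in UTC, and that the target field accepts ISO 8601 strings. None of these details come from the original poster.

```python
from datetime import datetime, timezone

# Hypothetical mapping from flat client columns to dotted XDM field paths.
# Replace with the paths from your own XDM schema.
FIELD_MAP = {
    "email": "personalEmail.address",
    "first_name": "person.name.firstName",
    "signup_ts": "_tenant.signupDate",
}

# Columns whose values should be normalized as datetimes (assumption).
DATETIME_FIELDS = {"signup_ts"}


def to_iso8601_utc(value: str) -> str:
    """Parse 'YYYY-MM-DD HH:MM:SS' (assumed to be UTC) into an ISO 8601 string."""
    parsed = datetime.strptime(value, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return parsed.strftime("%Y-%m-%dT%H:%M:%SZ")


def set_path(target: dict, dotted_path: str, value) -> None:
    """Set a value inside nested dicts, creating intermediate objects as needed."""
    parts = dotted_path.split(".")
    node = target
    for key in parts[:-1]:
        node = node.setdefault(key, {})
    node[parts[-1]] = value


def to_xdm_record(flat_row: dict) -> dict:
    """Reshape one flat row into a nested record that follows FIELD_MAP."""
    record: dict = {}
    for column, path in FIELD_MAP.items():
        value = flat_row.get(column)
        if value is None:
            continue  # leave missing fields out rather than writing nulls
        if column in DATETIME_FIELDS:
            value = to_iso8601_utc(value)
        set_path(record, path, value)
    return record


row = {"email": "a@example.com", "first_name": "Ada", "signup_ts": "2024-05-13 08:30:00"}
xdm_record = to_xdm_record(row)
```

The resulting nested dictionaries could then be written to parquet with a library such as pyarrow (for example `pyarrow.Table.from_pylist` followed by `pyarrow.parquet.write_table`), ideally with an explicit schema so the column types match the XDM schema.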

1 reply

Tof_Jossic
Adobe Employee
May 14, 2024

@davidslaw1 I'm not sure I've got this right, but I assume you are ingesting the data using drag and drop on the dataset UI page, which does not allow for any mapping options.

However, most of the Cloud Storage options would let you select the parquet format in your dedicated repository and then go through the 'Mapping' step.

 

See Map data fields to an XDM schema

Let me know if that helps.

DavidSlaw1
May 14, 2024

Does not help. Selecting a parquet file from S3 is fine, but there are no options to map in the UI workflow.

September 16, 2024

Solved. There is no way to ingest non-XDM-compliant parquet. I solved it by making the parquet XDM-compliant using a data pipeline, ensuring the datetime values are in a format AEP can ingest without errors, and sizing the data files to optimize load time. The parquet file loads 60% faster than the CSV.
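The post gives no target file size, so the file-sizing step can only be sketched under assumptions. The example below splits a stream of records into fixed-size batches, one batch per output file. The `rows_per_file` value and the `part-00000.parquet` naming are hypothetical. In practice you would tune the batch size against observed load times.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple


def batched(records: Iterable[dict], rows_per_file: int) -> Iterator[List[dict]]:
    """Yield consecutive batches of at most rows_per_file records."""
    if rows_per_file <= 0:
        raise ValueError("rows_per_file must be positive")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, rows_per_file))
        if not batch:
            return
        yield batch


def plan_files(
    records: Iterable[dict], rows_per_file: int, prefix: str = "part"
) -> List[Tuple[str, List[dict]]]:
    """Pair each batch with a hypothetical output file name."""
    return [
        (f"{prefix}-{index:05d}.parquet", batch)
        for index, batch in enumerate(batched(records, rows_per_file))
    ]


# Five records split into files of at most two rows each.
plan = plan_files([{"id": i} for i in range(5)], rows_per_file=2)
```

Each `(file name, batch)` pair can then be handed to whatever parquet writer the pipeline uses.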


@davidslaw1 how did you create an XDM-compliant parquet structure?