FridayFinding | 4 | Data Ingestion - Sources; data stream

Employee

Hey folks,

Last Friday, we connected XDM, schemas and data streams in this post. To reiterate, XDM is the framework on which schemas are built for data ingestion. The actual data flows into schemas through data streams. Schemas are the rules by which a dataset stores data in the form of tables. This Friday we look more closely at ingesting data from sources and getting it into the data streams.

Source: A source is a general term for any input connector in Platform. Sources in Experience Platform include Adobe applications, advertising, cloud storage, CRMs, customer success, databases, payments, streaming and protocols.

Source connector: Source connectors (also known as sources) help users easily ingest data from multiple sources, allowing the structuring, labelling and enhancement of data using Experience Platform services. Data can be ingested from a variety of sources such as cloud-based storage, third-party software and CRM systems.

Data ingestion: Data ingestion is the process of adding data from a source to Experience Platform. Data can be ingested into Platform in several ways, including streaming, batch, or via source connectors.

Let’s relate all of them under “what to do”

Use Experience Platform to centralize data collected from disparate sources. This data then flows to AAM, Analytics and modelling to give insights into customer behaviour.

Learn about the various connectors here.

Follow the API tutorial for source connectors here.

Learn about data ingestion here.

When and why

Data is ingested either by streaming or in batches, depending on your requirements. Streaming ingestion is implemented through Launch, while batch ingestion can be as simple as using an Adobe Experience Platform workflow to take a CSV file, map it against an XDM schema and ingest it into Adobe Experience Platform. So once you have figured out which customer preferences you would like to know in order to personalize their experience, you would create schemas to capture that particular data. The next step is configuring data sources for ingestion.
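
To make the "take a CSV file and map it against an XDM schema" step more concrete, here is a minimal Python sketch. It is not the Platform workflow itself, only an illustration of what the mapping does conceptually: flat CSV columns become nested, schema-shaped records. The column names and the dotted field paths in FIELD_MAP are illustrative assumptions, not taken from any particular schema.

```python
import csv
import io

# Hypothetical mapping from CSV column names to dotted, XDM-style field paths.
FIELD_MAP = {
    "email": "personalEmail.address",
    "first_name": "person.name.firstName",
    "last_name": "person.name.lastName",
}


def set_path(record, dotted_path, value):
    """Set a value in a nested dict, creating intermediate dicts as needed."""
    keys = dotted_path.split(".")
    node = record
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


def csv_to_xdm(csv_text, field_map=FIELD_MAP):
    """Convert CSV text into a list of nested, schema-shaped records."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = {}
        for column, path in field_map.items():
            value = row.get(column)
            if value:  # skip missing or empty cells
                set_path(record, path, value)
        records.append(record)
    return records
```

In the real workflow the mapping is configured in the Platform UI; the sketch only shows why a schema needs to exist before the file can be ingested.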

Outcome

Once schemas are defined and data connectors are bundled properly in data streams, you will be able to see the data flowing into profiles and events.

Golden Nugget from @Ankita_Sodhi 

“There is some expected latency for each source connector. So, plan your use case considering those delays.

Details on the expected latency for each connector are mentioned in the product documentation.”

 

Cheers,

Nimasha/Ankita

 

 

 

3 Replies

Level 6

Found some interesting things using the streaming (HTTP API) connector:

  1. It seems that all you need is a connection, found in the "Accounts" tab, in order to set up a working inlet URL to send data to.
  2. Related to the above, the benefit of creating flows in the "Dataflows" tab appears to be monitoring batches in a more organized way, so it is easier to see which source is contributing a particular batch.
  3. The ability to delete "Account" connections has not been built out, even at the API level. Though a DELETE request is possible, it simply removes the connection from the interface; the system still has knowledge of it, since it won't let you create a new connection with the same name.
  4. When sending data, there is an option to specify the dataflow name, but this seems somewhat superficial, since valid data will make it to the specified dataset no matter which flow name is included.
  5. The AEP API is very flexible about how flows are defined: they can be created incrementally, with source and destination connections specified separately, or you can define the target dataset during initialization.
  6. Haven't tried it, but it looks like the API may be able to support multiple target datasets if you want to group the monitoring of several flows.
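
As a rough illustration of observation 1, sending data to an inlet URL boils down to a JSON POST. The sketch below only builds the request with Python's standard library and does not send it. The function name and the placeholder URL are hypothetical; the real inlet URL comes from the connection you create.

```python
import json
import urllib.request


def build_inlet_request(inlet_url, message):
    """Build (but do not send) a JSON POST request for a streaming inlet."""
    data = json.dumps(message).encode("utf-8")
    return urllib.request.Request(
        inlet_url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL for illustration only; substitute your own inlet URL.
request = build_inlet_request(
    "https://example.invalid/collection/PLACEHOLDER",
    {"hello": "world"},
)
```

Passing the request to urllib.request.urlopen would actually send it; that step is left out here so the sketch stays side-effect free.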

Employee

Thanks @Jacob-DDdev for the great observations.

Apart from the connection, you also need to set up XDM schemas and data streams for data to flow. Account profiles enable you to unify account information from multiple sources. This unified view of an account brings together data from across your many marketing channels and the diverse systems that your organization is currently using to store customer account information.

 

In observation 4, do you mean that the dataflow name is superficial when sending data to a specific destination?

 

It would be great if you could add observation 3 to Ideas so that the PM team can look into it.

 

Cheers,

Nimasha

Level 6

With regard to observation 4, when constructing the XDM payload for HTTP API (streaming) ingestion, there are two fields that seem to be optional and don't appear to serve any purpose:

1. There is a "flowId" field that can be included, ref, but that same field isn't specified in the API request body documentation, here.

2. There is a field in the request body schema called SOURCE_NAME which is optional, but it seems this value can be set to anything without impacting the ability to ingest the data.
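
To show where those two fields would sit, here is a minimal Python sketch of a message envelope. The layout (a "header" with schemaRef, imsOrgId, datasetId, plus a "body" with xdmMeta and xdmEntity) is my recollection of the streaming ingestion examples, so treat every field name as an assumption to check against the documentation. The function name and all IDs are hypothetical placeholders.

```python
def build_stream_message(schema_id, org_id, dataset_id, entity,
                         flow_id=None, source_name=None):
    """Build a streaming message; flow_id and source_name are optional."""
    schema_ref = {
        "id": schema_id,
        # Assumed content type string; verify against the docs.
        "contentType": "application/vnd.adobe.xed-full+json;version=1",
    }
    header = {
        "schemaRef": schema_ref,
        "imsOrgId": org_id,
        "datasetId": dataset_id,
    }
    # The two fields discussed above are only added when provided.
    if flow_id is not None:
        header["flowId"] = flow_id
    if source_name is not None:
        header["source"] = {"name": source_name}
    return {
        "header": header,
        "body": {
            "xdmMeta": {"schemaRef": schema_ref},
            "xdmEntity": entity,
        },
    }


# Placeholder IDs for illustration only.
minimal = build_stream_message("SCHEMA_ID", "ORG_ID", "DATASET_ID", {"_id": "1"})
labelled = build_stream_message("SCHEMA_ID", "ORG_ID", "DATASET_ID", {"_id": "1"},
                                flow_id="FLOW_ID", source_name="my-source")
```

In my testing, as described above, both variants appeared to land in the same dataset, which is why the two fields look optional.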

 

For observation 3, I'll create an idea for it.