FridayFinding | 3 | XDM, schema, and datasets

NimashaJain
Community Manager

Hey folks,

Did you get your sandboxes up and enabled? If you haven’t yet, get it done by following this post.

Standardization and interoperability are key concepts behind Adobe Experience Platform. To ensure them, we have XDM, schemas, datasets and mixins. Let’s look at each of them in turn.

Experience Data Model (XDM): XDM is an open-source framework that uses standard schemas to unify data for use with Experience Platform and Adobe Experience Cloud applications. XDM standardizes how data is structured, which speeds up and simplifies the process of gaining insights from massive amounts of data.

Schema: A schema is a set of rules that represent and validate the structure and format of data. A schema is composed of a class and optional field group(s) and is used to create datasets and datastreams.
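
To make the "class plus optional field groups" idea concrete, here is a minimal in-memory sketch. It is not the Schema Registry API: the names `FieldGroup`, `Schema`, `all_fields`, and `validate`, and the sample fields, are all hypothetical and only illustrate how a schema's rules could be assembled and used to check a record.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FieldGroup:
    """A reusable bundle of fields (hypothetical model, not an Adobe API)."""
    name: str
    fields: dict  # field name -> expected Python type


@dataclass
class Schema:
    """A schema is one class (base fields) plus optional field groups."""
    title: str
    class_name: str
    class_fields: dict
    field_groups: list = field(default_factory=list)

    def all_fields(self) -> dict:
        # Start from the class fields, then layer each field group on top.
        merged = dict(self.class_fields)
        for group in self.field_groups:
            merged.update(group.fields)
        return merged

    def validate(self, record: dict) -> list:
        """Return a list of problems; an empty list means the record conforms."""
        allowed = self.all_fields()
        problems = []
        for key, value in record.items():
            if key not in allowed:
                problems.append(f"unknown field: {key}")
            elif not isinstance(value, allowed[key]):
                problems.append(f"wrong type for {key}")
        return problems


# Illustrative field groups and schema; the names are made up.
contact = FieldGroup("Contact details", {"email": str})
loyalty = FieldGroup("Loyalty", {"points": int})
profile_schema = Schema(
    title="Demo profile",
    class_name="Individual Profile",
    class_fields={"id": str},
    field_groups=[contact, loyalty],
)

good = profile_schema.validate({"id": "c-1", "email": "a@example.com", "points": 10})
bad = profile_schema.validate({"id": "c-1", "points": "ten", "shoeSize": 42})
print(good)
print(bad)
```

The first record conforms, so its problem list is empty; the second is flagged because `points` has the wrong type and `shoeSize` is not defined by the class or any field group.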

Data stream: A data stream is a set or collection of messages which share the same schema and are sent by the same source.

Dataset: A dataset is a storage and management construct for a collection of data, typically a table, that contains a schema (columns) and fields (rows).

Schema field group: In Experience Data Model (XDM), a schema field group allows users to extend reusable fields to define one or more attributes intended to be included in a schema. Field groups were formerly called mixins.

Let’s relate all of them in “what to do”

XDM is the framework on which schemas are built for data ingestion. The actual data flows to the schemas through datastreams. Schemas are the rules by which a dataset stores that data in the form of tables.

Learn the fundamentals of XDM from this video.

Learn about creating schemas here.

Set up your datasets from here.

When and Why

We rarely eat food raw at the table, and the same goes for data. Data cannot be ingested raw; it needs a format, standardization and cleansing, which XDM, schemas and datasets provide. So, once you have configured your permissions and started building XDM schemas, you will come across the question of what customer behavior you want to know about: personal details, preferences or anything else. Thinking this through will tickle the marketer in you to know your customer well!

Outcome

You should be able to see the schemas and data streams up and running. Post your dataset snapshot to get closer to winning the exclusive swag.

Golden Nugget from @skandg43264764 

“Datasets can also be created using Query Service.
Steps:

  • Run a query with custom conditions on data already ingested into one or more datasets in AEP.
  • Put the resulting data into a new dataset using AEP Query Service.

Thus, you can make use of existing data to generate new data as per your use case or requirement.”
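
The two steps above can be sketched as a single "create table as select" (CTAS) statement. This is only a sketch, assuming Query Service accepts the form `CREATE TABLE name AS (SELECT ...)`; the helper `build_ctas` and every dataset and column name below are hypothetical, so check the Query Service SQL reference before running anything like it.

```python
def build_ctas(new_dataset: str, select_query: str) -> str:
    """Wrap a SELECT in a CREATE TABLE ... AS statement (CTAS)."""
    # Keep the target name simple so it is safe to place in the statement.
    if not new_dataset.replace("_", "").isalnum():
        raise ValueError("use letters, digits, and underscores in the dataset name")
    select_query = select_query.strip().rstrip(";")
    if not select_query.lower().startswith("select"):
        raise ValueError("expected a SELECT query")
    return f"CREATE TABLE {new_dataset} AS ({select_query});"


# Hypothetical source datasets and columns, for illustration only.
query = build_ctas(
    "high_value_customers",
    """
    SELECT p.customer_id, p.email, SUM(o.order_total) AS lifetime_value
    FROM profile_dataset p
    JOIN orders_dataset o ON o.customer_id = p.customer_id
    GROUP BY p.customer_id, p.email
    HAVING SUM(o.order_total) > 1000
    """,
)
print(query)
```

The SELECT reads from two already-ingested datasets with a custom condition, and the surrounding CREATE TABLE is what would turn the result into a new dataset.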

 

Cheers,

Nimasha/Skand

4 Replies
bangar50
Level 7

Thanks @NimashaJain for sharing this information. I am trying it out on my sandbox.

 

Regards,

Sanjay

 

jkm-disco
Level 5

As an interesting use case, as seen in the picture below, you can build multiple datasets from the same schema. This method has pros and cons...

Pros:

  • Merge policies are easy to create across datasets of the same schema class, because you know the same fields can exist across the datasets.
  • If only a subset of data matching a particular schema is needed for running a service (e.g. Privacy Service or Intelligence Service), it can be stored in a smaller dataset, which will allow the services to run faster.
  • Better access management.

Cons:

  • The only safeguard on what data is permissible in a dataset is the schema it is built on. So if there are two datasets with the same schema, and one is designed to exclude particular data (e.g. PII), there is nothing on Adobe's side preventing those fields from being ingested into that dataset. That is, all fields within a schema are permissible no matter the intended use of a dataset.
  • The more datasets that could have valid data for Customer Profile, the more merge policies will need to be maintained.
  • The more datasets that are created, the more ingestion flows will be needed if granular monitoring is desired.
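
To illustrate the first con, here is a small in-memory sketch. Nothing in it is an Adobe API: the `Dataset` class, the field names, and the `blocked_fields` guard are hypothetical. It shows that when two datasets share a schema, a schema check alone lets a PII field into both, so any extra filtering would have to be enforced on your side before ingestion.

```python
# Hypothetical in-memory model; none of these names are Adobe APIs.
SCHEMA_FIELDS = {"customer_id", "email", "purchase_total"}
PII_FIELDS = {"email"}  # what the team intends to keep out of one dataset


class Dataset:
    def __init__(self, name, schema_fields, blocked_fields=frozenset()):
        self.name = name
        self.schema_fields = schema_fields
        self.blocked_fields = set(blocked_fields)  # optional team-side guard
        self.rows = []

    def ingest(self, record):
        # Schema check: the only safeguard the platform side is described as giving.
        unknown = set(record) - self.schema_fields
        if unknown:
            raise ValueError(f"not in schema: {sorted(unknown)}")
        # Extra guard that the team would have to add and maintain itself.
        blocked = set(record) & self.blocked_fields
        if blocked:
            raise ValueError(f"blocked for {self.name}: {sorted(blocked)}")
        self.rows.append(record)


record = {"customer_id": "c-1", "email": "a@example.com", "purchase_total": 120.0}

full = Dataset("profiles_full", SCHEMA_FIELDS)
no_pii = Dataset("profiles_no_pii", SCHEMA_FIELDS)  # schema check only
guarded = Dataset("profiles_no_pii_guarded", SCHEMA_FIELDS, PII_FIELDS)

full.ingest(record)
no_pii.ingest(record)  # succeeds: the schema allows email, so PII gets in

try:
    guarded.ingest(record)
    guard_error = None
except ValueError as exc:
    guard_error = str(exc)
print(guard_error)
```

Only the dataset with the extra guard rejects the record; the "no PII" dataset that relies on the schema alone accepts it.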

Datasets.png

jkm-disco
Level 5

Another interesting find: though there is no mention of it in the documentation, be careful with how frequently you use the Catalog Service API to create datasets. If you attempt to create too many at once, your IP address can be blocked from that particular resource.
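
One way to stay on the safe side is to space out the creation calls and back off when one fails. The sketch below is an assumption-heavy illustration: since the limit is undocumented, the delay and retry values are guesses, `create_one` is a hypothetical callable standing in for whatever code actually calls the Catalog Service API, and `RuntimeError` stands in for whatever error that code raises when it is throttled.

```python
import time


def create_datasets_throttled(names, create_one, delay_seconds=1.0,
                              max_retries=3, sleep=time.sleep):
    """Create datasets one at a time, pausing between calls and backing off on errors."""
    created = []
    for index, name in enumerate(names):
        if index > 0:
            sleep(delay_seconds)  # space out successive requests
        for attempt in range(max_retries + 1):
            try:
                created.append(create_one(name))
                break
            except RuntimeError:
                if attempt == max_retries:
                    raise  # give up after the last retry
                sleep(delay_seconds * 2 ** (attempt + 1))  # exponential backoff
    return created


# Stand-in for the real API call; it fails once to exercise the backoff path.
calls = {"count": 0}


def fake_create(name):
    calls["count"] += 1
    if calls["count"] == 2:
        raise RuntimeError("simulated throttling response")
    return f"id-for-{name}"


pauses = []  # record the pauses instead of actually sleeping
result = create_datasets_throttled(["a", "b", "c"], fake_create,
                                   delay_seconds=0.5, sleep=pauses.append)
print(result, pauses)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; in real use you would leave it at the default `time.sleep`.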