Hello Team,
Welcome to the Adobe Real-Time CDP Community Mentorship Program 2024! This is the featured Community Discussion/Contextual thread for your Adobe Real-Time CDP Community Mentor, Bitun Sen!
Bitun will be your dedicated mentor, providing valuable support and guidance on your Adobe Real-Time CDP queries as you upskill yourself and prepare for Real-Time CDP certification throughout the program.
Know your Mentor: Bitun Sen (aka @bitunsen-2022)
Bitun brings two decades of hands-on experience working with various master data management and data warehousing applications. Currently, he is involved in providing implementation guidance and support for various customer data platform-specific engagements. Throughout his career, he has trained many co-workers in various tools and technologies, including Adobe Experience Platform.
He is looking forward to helping others by sharing his knowledge and skills on Adobe Experience Platform as they upskill themselves.
Aspirants mapped to Bitun Sen
1) Dhanesh Sharma aka @dhanesh_sh
2) Kana Nguyen aka @KanaNg
3) Saurabh Channe aka @SaurabhCh
4) Ankit Agarwal aka @ankitagarwal05
5) Najlaa Heerah aka @najlaah
6) Sanchari Das aka @SD_11
7) Sanjay RJ aka @SanjayR_
8) Indra Kumar Reddy Madhuru aka @inder
9) Pranathi Priya Valapa aka @pranathipriya
How to participate in the program
Suggested Next Steps for Aspirants:
Remember that every post, like, or comment you make in your contextual thread and the Real-Time CDP Community throughout the program helps increase your chances of being recognized by your Mentor and winning exclusive Adobe swag, so bring your best efforts!
We wish you all the best as you embark on this learning experience!
What are the allowed schema changes after marking the schema and dataset as Profile-enabled?
Allowed changes
Breaking changes/Not supported changes
5. What are the 3 types of core entities we can have in AEP?
4. What are the required fields for any Experience Event schema?
_id
Timestamp
Team,
Let us start going through RT-CDP and Data Ingestion this week. Below are a few important links for reference:
For RT-CDP, please understand
Specifically for data ingestion, please go through the various types of sources we have. Try to understand
I would love to talk to you all before Friday to see if you have any questions about Data Architecture, RT-CDP, and Data Ingestion. Would you all be OK to meet at 6:00 PM EST on Wednesday? Please let me know and I will schedule it.
Happy Learning!!!!
Hi @Bitun
What is a Mapping Set ID, in what scenario do we create it, and where can we see the mapping details in the UI?
As per the documentation, every dataflow will have a Mapping Set ID, but I don't see a mapping set ID for all dataflows.
"Mapping set
A set of mappings that transform one schema to another are collectively known as a mapping set. A single mapping set is created as part of each data flow. A mapping set is an integral part of the data flows and is created, edited, and monitored as part of the data flows.
"
We had discussed this in our last call on Wednesday - hope you understood it. If not, we can discuss this again in today's call.
@Indra @SaurabhCh @DhaneshSh - Hope to talk to you in today's session.
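For reference, the mapping set is not shown as a separate object in the UI; as far as I have seen, it gets created as part of the Mapping step of a dataflow, and dataflows that skip the mapping step (for example, when the source data already matches the target schema) will not show a mapping set ID. If you want to look at one outside the UI, here is a minimal sketch using the Data Prep (Mapping Service) API; all IDs and the token are placeholders, and the response handling is just an assumption about the typical payload shape.

```python
# Minimal sketch (not an official sample): look up a mapping set by ID via the
# Data Prep (Mapping Service) API. Replace every <...> placeholder with real
# values for your org/sandbox before running.
import requests

BASE = "https://platform.adobe.io/data/foundation/conversion"
MAPPING_SET_ID = "<MAPPING_SET_ID>"  # e.g. the mappingId referenced in a dataflow's transformations

headers = {
    "Authorization": "Bearer <ACCESS_TOKEN>",
    "x-api-key": "<API_KEY>",
    "x-gw-ims-org-id": "<IMS_ORG_ID>",
    "x-sandbox-name": "<SANDBOX_NAME>",
}

resp = requests.get(f"{BASE}/mappingSets/{MAPPING_SET_ID}", headers=headers)
resp.raise_for_status()

# Each mapping in the set describes one source field -> destination XDM path.
for mapping in resp.json().get("mappings", []):
    print(mapping.get("sourceType"), mapping.get("source"), "->", mapping.get("destination"))
```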
Due to some ongoing project issues, I could not post the questions on RT-CDP and Data Ingestion. Please follow this channel - I will be posting the questions by Sunday (August 4th).
As always, it was great connecting with you all! As discussed over the call, below is some important information, along with links, that you all need to focus on:
1. Upsert - https://experienceleague.adobe.com/en/docs/experience-platform/data-prep/upserts
Please note: if you are making any change using UPSERT, that updated data does not flow into the dataset residing in the data lake (please refer to the above picture). A hedged payload sketch follows after this list.
2. Ingest CSV data using Data Ingestion API: https://experienceleague.adobe.com/docs/experience-platform/ingestion/batch/api-overview.html?lang=e...
3. Various batch ingestion troubleshooting points: https://experienceleague.adobe.com/en/docs/experience-platform/ingestion/batch/troubleshooting
4. Various Data Prep functions: https://experienceleague.adobe.com/en/docs/experience-platform/data-prep/home
5. Edge profile vs Hub Profile - how data flows to which data store and when - https://experienceleague.adobe.com/en/docs/experience-platform/profile/edge-profiles
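On point 1 (upserts), the merge behaviour is driven by an operations block in the header of the streamed payload, as far as I recall from the upserts guide; treat the field names below as assumptions and verify them against the linked document. A rough sketch of such a payload, with hypothetical identity/attribute fields:

```python
# Hedged sketch of a streaming upsert (partial profile update) payload.
# Field names follow my reading of the upserts guide and may need adjusting;
# all IDs and the xdmEntity fields are placeholders for your own schema.
upsert_payload = {
    "header": {
        "imsOrgId": "<IMS_ORG_ID>",
        "datasetId": "<PROFILE_ENABLED_DATASET_ID>",
        # "merge" asks Data Prep to treat this record as an attribute update
        # (upsert) against the existing profile instead of a full overwrite.
        "operations": {"data": "merge"},
    },
    "body": {
        "xdmEntity": {
            # Hypothetical fields - supply the identity and the attributes
            # you actually want to update, per your schema.
            "personalEmail": {"address": "jane.doe@example.com"},
            "loyalty": {"tier": "GOLD"},
        }
    },
}
```

And to repeat the caveat above: the merged values update the Real-Time Customer Profile store, not the dataset copy sitting in the data lake.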
As discussed, I will set up another sync-up call on Wednesday (August 7th) at 6:00 PM EST.
Happy Learning!!!!
Everybody - we have had great sessions over the last couple of weeks, where we all talked about what we have learnt along with various real-life challenges we face while working with AEP. However, I can see active participation from only 3 participants:
1) Dhanesh Sharma aka @dhanesh_sh
2) Saurabh Channe aka @SaurabhCh
3) Indra Kumar Reddy Madhuru aka @inder
It's a great way of building your network and sharing knowledge and experiences. I request the others to join as well so that we can learn from your experience too:
1) Kana Nguyen aka @KanaNg
2) Ankit Agarwal aka @ankitagarwal05
3) Najlaa Heerah aka @najlaah
4) Sanchari Das aka @SD_11
5) Sanjay RJ aka @SanjayR_
6) Pranathi Priya Valapa aka @pranathipriya
Hi Bitun,
How much time does it take to update the dataset if I ingest data using the Streaming API? For me it took almost 30 minutes to see data in the dataset.
Is this expected?
Do we have any SLA for streaming data ingestion?
I see the Batch API updates the dataset almost immediately, within a few seconds.
@Indra - I need to find the Experience League document which talks about this. Usually, what I have seen is that if you are ingesting data through streaming (e.g. the HTTP API), it takes 15-20 minutes to be seen in the data lake (using Query Service).
Sometimes, on rare occasions, I have seen records becoming available in the data lake only after 30 minutes.
The delay you experienced when ingesting data using the Streaming API in Adobe Experience Platform (AEP) is not unusual. Typically, data ingested via Streaming API is processed in near real-time, but the visibility of this data in the data lake can take around 15-20 minutes. However, it can sometimes take up to 30 minutes under certain conditions.
This delay happens because while data is streamed into the Real-Time Customer Profile almost immediately, it is then batched and sent to the data lake every 15 minutes. Therefore, there is a slight delay before the data is available for querying or other processing within the data lake.
Regarding SLAs, Adobe doesn’t publicly document specific SLAs for streaming data ingestion times, but it is generally expected that streaming data should be processed quickly and be available within the timeframe you observed.
For more detailed information, you can refer to Adobe's Experience League documentation on data ingestion processes.
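To make the timing concrete, below is a minimal sketch of streaming a single XDM record to a streaming HTTP connection (the dcs.adobedc.net collection endpoint). The connection, schema, org, and dataset IDs are placeholders. After a call like this, the record is typically visible on the Real-Time Customer Profile within seconds, while the data lake copy generally shows up in Query Service only after the 15-minute micro-batching window mentioned above.

```python
# Minimal sketch: stream one XDM-formatted record to a streaming connection.
# Every <...> value is a placeholder; syncValidation=true returns synchronous
# validation feedback, which is handy while testing.
import requests
from datetime import datetime, timezone

CONNECTION_ID = "<STREAMING_CONNECTION_ID>"
url = f"https://dcs.adobedc.net/collection/{CONNECTION_ID}?syncValidation=true"

schema_ref = {
    "id": "https://ns.adobe.com/<TENANT_ID>/schemas/<SCHEMA_ID>",
    "contentType": "application/vnd.adobe.xed-full+json;version=1",
}

payload = {
    "header": {
        "schemaRef": schema_ref,
        "imsOrgId": "<IMS_ORG_ID>",
        "datasetId": "<DATASET_ID>",
        "source": {"name": "mentorship-demo"},
    },
    "body": {
        "xdmMeta": {"schemaRef": schema_ref},
        "xdmEntity": {
            "_id": "evt-0001",
            "timestamp": datetime.now(timezone.utc).isoformat(),
            # remaining event fields depend on your schema
        },
    },
}

resp = requests.post(url, json=payload)
print(resp.status_code, resp.text)
```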
Team,
Please try to answer these questions:
Team,
Please try to answer these questions. If you face any challenge in any of the topics we have covered, please feel free to reach out to me by any means.
Thanks,
Bitun
Describe the functionality of Backfill which comes out of the box for various types of data sources
Backfill determines what data is initially ingested. If backfill is enabled, all current files in the specified path will be ingested during the first scheduled ingestion. If backfill is disabled, only the files that are loaded in between the first run of ingestion and the start time will be ingested. Files loaded prior to the start time will not be ingested.
Interval and backfill are not visible during a one-time ingestion.
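For anyone looking at this through the Flow Service API rather than the UI: the backfill switch sits alongside the schedule when the dataflow is created. A hedged fragment follows (field names per my reading of the cloud storage dataflow tutorial, so double-check them before relying on this):

```python
# Hedged fragment of a dataflow creation request body, showing where backfill
# lives relative to the schedule. Values are placeholders.
schedule_params = {
    "scheduleParams": {
        "startTime": "1722816000",  # epoch seconds for the first scheduled run
        "frequency": "hour",        # how often the source location is polled
        "interval": 1,
        # backfill=True: the first run also ingests files already sitting in
        # the source path; backfill=False: only files that land after the
        # start time are picked up.
        "backfill": True,
    }
}
```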
1. Describe the functionality of Backfill which comes out of the box for various types of data sources.
Backfill refers to processing historical data that already existed in a data source prior to the initial setup of a dataflow. When Backfill is enabled, the system will automatically pick up and process all available data from the start, ensuring that the dataflow catches up with any historical data that may not have been processed during the regular data ingestion process.
2. What behavior will the data engineer observe - will it pick up all the files present in the S3 bucket? If so, then why?
Yes, the data engineer will observe that the dataflow picks up all the files present in the S3 bucket. This happens because Backfill is enabled, which means that when the dataflow is re-enabled after being disabled, it will process all the files that were added to the S3 bucket during the period it was disabled. The Backfill functionality ensures that no data is missed by processing all unprocessed files.
3. Which data prep function will be needed to convert this date to 'yyyy-MM-DDTHH:mm.SSSSS' format: dformat or format?
To convert the date from 'MM-DD-yyyy HH24:mm' to 'yyyy-MM-DDTHH:mm.SSSSS' format, you would use the dformat function. The dformat function is used to convert date and time values from one format to another, whereas the format function is generally used for string formatting.
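For illustration only (worth double-checking the exact argument types against the Data Prep functions reference): once the source value is available as a timestamp, something like dformat(order_ts, "yyyy-MM-dd'T'HH:mm:ss.SSS") would render it in the target format; order_ts is a hypothetical field name.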
4. How can you do that using iif and decode functions?
To transform the values in the "TIER" field using iif and decode functions, you can use the following approach:
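For illustration (the actual tier values are not specified in this thread, so the "gold"/"GOLD" pairs below are hypothetical):
iif(TIER.equals("gold"), "GOLD", iif(TIER.equals("silver"), "SILVER", TIER))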
Alternatively, using the decode function, you could write:
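decode(TIER, "gold", "GOLD", "silver", "SILVER", "UNKNOWN")
(the key/value pairs and the "UNKNOWN" default are placeholders; decode returns the value that follows the matching key, or the trailing default if no key matches).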
This logic checks the value of the "TIER" field and transforms it accordingly.
To get the metadata of ingested data using batch, you should use the Catalog API. The Catalog API provides metadata and catalog information for datasets, including schema, lineage, and other relevant information.
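A minimal sketch of that lookup, assuming you already have the batch ID that was returned when the batch was created; the IDs, token, and response handling below are placeholders/assumptions:

```python
# Minimal sketch: fetch metadata for a single batch via the Catalog Service API.
# Replace the <...> placeholders with real values for your org/sandbox.
import requests

BATCH_ID = "<BATCH_ID>"
url = f"https://platform.adobe.io/data/foundation/catalog/batches/{BATCH_ID}"

headers = {
    "Authorization": "Bearer <ACCESS_TOKEN>",
    "x-api-key": "<API_KEY>",
    "x-gw-ims-org-id": "<IMS_ORG_ID>",
    "x-sandbox-name": "<SANDBOX_NAME>",
}

resp = requests.get(url, headers=headers)
resp.raise_for_status()

# Catalog returns an object keyed by batch ID; each entry carries status,
# metrics, and the related dataset/file references.
for batch_id, meta in resp.json().items():
    print(batch_id, meta.get("status"), meta.get("relatedObjects"))
```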