Hello,
I have set up an HTTP API Streaming Connector in Experience Platform. I noticed the following when ingesting data:
1. The dataflow starts the processing every hour.
2. I see a lag of >15 min before the data actually shows up as processing in the dataflow execution.
3. The records then take close to 10 min to appear in datasets.
I am using the /collection/batch API to ingest the data, and the number of records is very low. Is it usual for data to take close to 25-30 min to land even with streaming ingestion? If so, is there a faster way to ingest it?
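For context, a call to the streaming /collection/batch endpoint wraps each record in a message envelope. Below is a minimal sketch of how I build that payload; the inlet ID, the `msgType` header, and the event fields are placeholders or assumptions, not values from a real configuration, and the actual POST is shown only as a comment since it needs a valid access token.

```python
import json

# Hypothetical inlet (connection) ID -- replace with your own.
INLET_ID = "{CONNECTION_ID}"
ENDPOINT = f"https://dcs.adobedc.net/collection/batch/{INLET_ID}"

def build_batch_payload(events):
    """Wrap individual XDM records into the batch envelope: one entry
    in "messages" per record, each with a header and an xdmEntity body."""
    return {
        "messages": [
            {
                # Header fields (schema/dataset routing) are usually set on
                # the inlet itself; this minimal header is an assumption.
                "header": {"msgType": "xdmEntityCreate"},
                "body": {"xdmEntity": event},
            }
            for event in events
        ]
    }

payload = build_batch_payload([{"personID": "12345", "email": "a@example.com"}])
print(json.dumps(payload, indent=2))

# The actual send (not executed here) would look roughly like:
# requests.post(ENDPOINT, json=payload,
#               headers={"Authorization": f"Bearer {token}",
#                        "Content-Type": "application/json"})
```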
Thanks
With the streaming ingestion API, data is first ingested into the Real-Time Customer Profile store and is then moved to the data lake in subsequent batches, which has a latency of less than 60 minutes. This is the usual flow of the AEP pipeline. Once streaming ingestion succeeds, you will be able to see the attribute or event data in the Real-Time Customer Profile.
| Destination | Expected latency |
|---|---|
| Real-Time Customer Profile | < 1 minute |
| Data Lake | < 60 minutes |
Here is the architecture of AEP, which should help you understand the data flow:
Hope this helps.
As @Avinash_Gupta_ mentioned, there are a few data stores in AEP. In addition, the metrics you see in the UI are updated periodically, while the Profile itself is updated closer to real time when data comes in from a streaming source.
To validate in near real time whether a profile or event has landed in the Profile store, I would suggest looking up that profile using the Profile Viewer UI or the Profile API.
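A profile lookup against the Profile Access API is a GET on `/access/entities` with the entity ID and its namespace. Here is a minimal sketch that only builds the request URL (the schema name and `email` namespace are typical values, and the headers listed in the comment are assumptions to verify against your sandbox):

```python
from urllib.parse import urlencode

BASE = "https://platform.adobe.io/data/core/ups/access/entities"

def profile_lookup_url(entity_id, entity_id_ns="email"):
    """Build the Profile Access API URL for a single-profile lookup,
    keyed by an identity value and its namespace."""
    params = {
        "schema.name": "_xdm.context.profile",  # union profile schema
        "entityId": entity_id,
        "entityIdNS": entity_id_ns,
    }
    return f"{BASE}?{urlencode(params)}"

url = profile_lookup_url("janedoe@example.com")
print(url)

# The GET itself (not executed here) would also need the usual AEP headers:
# Authorization (bearer token), x-api-key, x-gw-ims-org-id, x-sandbox-name.
```

If the record was streamed successfully, the lookup should return the merged profile within about a minute, which is a much faster check than waiting for the dataflow monitoring page to refresh.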
Thanks @Danny-Miller & @Avinash_Gupta_ ,
What I understood is that the latency is actually in the UI loading the data, but the ingestion itself meets the SLAs mentioned in the documentation?
After calling the streaming API, I usually keep refreshing the source dataflow page to see if the dataflow run has started or not (if not already processing), and if it has, whether the records are ingested or not.
But what I keep seeing is that the dataflow takes a long time to start processing; then, for a while, it only updates the number of records received; and only at the very end, after 30 min or so, does it update the number of records ingested or failed.
It seems you are suggesting that I shouldn't rely on that page and should instead make a Profile API call or search in the Profile Viewer UI. Is this understanding correct?
Regards
Abhishek
Correct. Use the Profile Viewer UI or Profile API if you want more real time checks.
Just verified, and it's almost instantaneous. Superb!
Thanks a lot!