SOLVED

Incremental load data discrepancy in New Profile Fragments and Existing Profile Fragments.

Level 3

Hi team,
I recently ran a batch data ingestion with incremental load enabled. The incremental load runs daily, and for each batch ID the records are ingested into the dataset. However, all of the records appear under New Profile Fragments as new records, and zero records appear under Existing Profile Fragments as existing or modified records. (This is specific to a dataset based on the XDM ExperienceEvent class.)
In the source, the daily count of new or updated records is low, but AEP shows a different count. What could be the reason?
All mappings are correct.

Screenshot below for your reference.

sandip_surse_1-1682494486028.png

 

Regards,

Sandip Surse

 

5 Replies

Community Advisor

Hi @sandip_surse - Which field have you selected for the incremental load? As I understand it, during ingestion AEP checks whether this field's timestamp is newer than the timestamp of the last batch ingestion. If it is newer, the record is ingested; otherwise it is ignored.

 

Can you check the values of this field?
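
As a rough illustration of the check described in this reply, here is a minimal Python sketch of a timestamp-based incremental filter. It is a toy model and not AEP's actual implementation; the function name `select_incremental`, the field name `updated_at`, and the sample rows are all assumptions made for the example.

```python
from datetime import datetime, timezone


def select_incremental(records, last_ingested_at, ts_field="updated_at"):
    """Return only records whose timestamp is strictly newer than the watermark."""
    return [r for r in records if r[ts_field] > last_ingested_at]


# Hypothetical watermark: the time of the previous batch ingestion.
watermark = datetime(2023, 4, 25, tzinfo=timezone.utc)

# Hypothetical source rows; only rows updated after the watermark should pass.
source_rows = [
    {"id": 1, "updated_at": datetime(2023, 4, 24, 9, 0, tzinfo=timezone.utc)},
    {"id": 2, "updated_at": datetime(2023, 4, 25, 12, 0, tzinfo=timezone.utc)},
    {"id": 3, "updated_at": datetime(2023, 4, 26, 8, 30, tzinfo=timezone.utc)},
]

delta = select_incremental(source_rows, watermark)
print([r["id"] for r in delta])  # [2, 3]
```

If the chosen column is rewritten for every row on each run (for example by a nightly job), a filter like this would pass every row, which would look like a full load.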

Level 5

@sandip_surse A few questions:

1) Did you use the Batch Ingestion API or a batch source connector? If it is a source connector, which type is it?

2) How are you handling incremental data: with an incremental column or through file generation?

3) Is the source generating the required _id (a UUID or GUID: the unique string identifier for the event) for the experience event, or are you generating it with Data Prep?

 

I have faced this issue and can make a recommendation based on your answers.

Level 3

Hi nnakirikanti,


Here are the answers you requested.

1. We use the Snowflake source connector for batch ingestion.
2. For incremental data, we use a timestamp column.
3. We generate the '_id' field in Data Prep during mapping, using the uuid() function.

Correct answer by
Level 5

Aha, there you go. If you are bringing in only incremental data, your updated existing profile fragments count will always be zero, because you are ingesting deltas. If you want to update existing data, you need to offload the UUID generation to the source and persist it as an attribute/field at the record level. That way, when you load historical data with updates, the system will not treat the records as new, and you will see the updated profile fragments count.
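
To make the difference concrete, here is a small Python sketch contrasting an ID generated at mapping time with an ID generated once at the source and persisted on the record. It is only an illustration: `mapping_time_id`, `assign_source_ids`, the `event_id` field, and the sample rows are hypothetical names, and Python's `uuid.uuid4()` merely stands in for the mapping's uuid() function.

```python
import uuid


def mapping_time_id(_row):
    # Stand-in for generating the ID during mapping: a fresh random
    # value is produced every time the same row is loaded.
    return str(uuid.uuid4())


def assign_source_ids(rows):
    """Give each row an event_id once; rows that already have one keep it."""
    for row in rows:
        row.setdefault("event_id", str(uuid.uuid4()))
    return rows


# Hypothetical source rows, exported twice (initial load, then a reload).
rows = assign_source_ids([{"order": "A-1"}, {"order": "A-2"}])
first_export = [r["event_id"] for r in rows]
second_export = [r["event_id"] for r in assign_source_ids(rows)]

# Persisted IDs are identical across exports of the same rows.
print(first_export == second_export)

# IDs generated at load time differ between two loads of the same row.
print(mapping_time_id(rows[0]) == mapping_time_id(rows[0]))
```

With the persisted field mapped to '_id', a reloaded record carries the same identifier it had the first time, whereas an ID generated during mapping changes on every load.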

 

Test this with a simple file of ten records with mocked UUIDs and ingest it. Then mock a second file with five old records (with their old UUIDs) and five new records. You should now see an existing profile fragments count of 5 and a new profile fragments count of 5.
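
The suggested test can be mocked up locally before trying it in a sandbox. The Python sketch below is a toy model that counts records by whether their `_id` has been seen before; it is not how AEP computes its profile fragment metrics, and `classify`, `seen`, and the `value` field are hypothetical names used only for illustration.

```python
import uuid


def classify(batch, seen_ids, id_field="_id"):
    """Count records whose ID is unseen (new) versus already seen (existing)."""
    new = existing = 0
    for record in batch:
        if record[id_field] in seen_ids:
            existing += 1
        else:
            new += 1
            seen_ids.add(record[id_field])
    return {"new": new, "existing": existing}


seen = set()

# First file: ten records with mocked UUIDs.
first_file = [{"_id": str(uuid.uuid4()), "value": i} for i in range(10)]
first_result = classify(first_file, seen)
print(first_result)   # {'new': 10, 'existing': 0}

# Second file: five updated old records (same UUIDs) plus five new records.
second_file = (
    [{"_id": r["_id"], "value": r["value"] + 100} for r in first_file[:5]]
    + [{"_id": str(uuid.uuid4()), "value": i} for i in range(5)]
)
second_result = classify(second_file, seen)
print(second_result)  # {'new': 5, 'existing': 5}
```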

 

Let me know if you have more concerns.

 

 

Level 3

Hi arpang16406580,

We are using the 'timestamp' field for the incremental load. When we check today's new records by timestamp at the source, the count is small, but in AEP a FULL LOAD happens every time the scheduled incremental load runs.
Possibly some field other than the timestamp field is updated daily, and that is what causes this.