Hello,
My question is in relation to the "profile richness" license metric, but not related to ExperienceEvents. I have set up a daily CSV ingestion from an SFTP source which maps into an IndividualProfile dataset which is enabled for profile. Each day an entire file is ingested, with many of the records in the file having no changes from the previous day. These are essentially duplicate profile fragments. Do these impact the "profile richness" license metric?
For example, if I upload a duplicate IndividualProfile record for a specific profile every single day for an entire year, would the "profile richness" metric keep growing because there are so many profile fragments, even though they contain the exact same values and the "unified" profile itself remains static the entire time?
I understand that with ExperienceEvents each one has a timestamp and each dataset can enforce a TTL. But this is not the case with IndividualProfile datasets and so I want to ensure that I am not increasing the "profile richness" license metric with these duplicate profile fragments going into the data lake enabled for profile.
Hello @PeteToasty
According to this documentation, it will impact profile richness and license usage.
Thanks! Say I had a dataset with 100M individual profile records and I wanted to clean it up by removing any records ingested last year while keeping records ingested this year. Is there any way to do that? I can't delete batches in a record schema dataset, so it seems like I would be stuck with those 100M records forever, and my profile richness license metric would only ever increase unless I delete the entire dataset?
It appears as though there are ways for managing ExperienceEvent datasets, such as TTL for Unified Profile Store and deletion of batches with query service, but no such tools for record schema datasets?
Hi @PeteToasty - Yes, if the same data is inserted again via CSV, it will impact the profile richness license metric. If you query the dataset, you will also see multiple identical entries for that profile.
A better approach could be to include in the CSV only the records that have been updated or newly added.
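As a rough illustration of that idea, here is a minimal Python sketch that compares today's file against yesterday's before upload and keeps only new or changed rows. It is not an Experience Platform API; it assumes the filtering happens on your side before the file lands on the SFTP source. The function names and the sample columns (`crm_id`, `email`, `tier`) are hypothetical.

```python
import csv
import hashlib
import io


def row_fingerprint(row):
    """Return a stable hash of a row's field names and values."""
    canonical = "\x1f".join(f"{key}={row[key]}" for key in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def load_fingerprints(csv_text, key_field):
    """Map each record's key to the fingerprint of its full row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_field]: row_fingerprint(row) for row in reader}


def delta_rows(previous_csv, current_csv, key_field):
    """Return rows of current_csv that are new or changed since previous_csv."""
    previous = load_fingerprints(previous_csv, key_field)
    reader = csv.DictReader(io.StringIO(current_csv))
    return [
        row for row in reader
        if previous.get(row[key_field]) != row_fingerprint(row)
    ]


def to_csv(rows, fieldnames):
    """Serialize the delta rows back to CSV text for upload."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical sample data: record 1 is unchanged, 2 changed, 3 is new.
    yesterday = "crm_id,email,tier\n1,a@example.com,gold\n2,b@example.com,silver\n"
    today = (
        "crm_id,email,tier\n"
        "1,a@example.com,gold\n"
        "2,b@example.com,gold\n"
        "3,c@example.com,bronze\n"
    )
    changed = delta_rows(yesterday, today, "crm_id")
    print(to_csv(changed, ["crm_id", "email", "tier"]))
```

In the sample, only records 2 and 3 would be written to the delta file, so the unchanged record 1 is not ingested again. For very large files you would probably want to stream from disk and persist the fingerprints rather than hold both files in memory.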
Thanks,
Arpan
Thanks! Yes that is the case when I query the dataset. The dataset definitely has duplicate records but I didn't know if that translated to duplicate profile fragments in the Unified Profile Store.
I was considering cleaning old batches from this dataset but then ran into this in the documentation.
That made me question how record-type datasets go into the Unified Profile Store: if they are 'overwriting', would each duplicate record simply overwrite the last one and cause no issue in the Unified Profile Store? (The only issue would be total storage in the data lake, which would continue to increase.)
Hi @PeteToasty - For data in the Profile store, the latest data is what gets referenced. In other words, the most recently inserted data for a profile is what is considered part of the profile.