
Need to delete a profile-enabled dataset to unstitch, but there are many AJO journeys with actions writing into it


Level 7

I have an AJO custom action that writes data back to a dataset. The custom action is used by many different journeys, and we have realized that we shouldn't have marked one of the identities as a secondary identity, because it is causing inaccurate identity stitching.

 

The only solution we have is to delete the dataset and recreate another one to unstitch everything, but that will impact all the journeys that are using the action. Updating all the journeys to use the new action is a very difficult task, not to mention that we would also need to go through all the draft/closed journeys that were using it.

akwankl_0-1723101049496.png

Basically the connection is what I am trying to delete.

 

1) Is there another way to unstitch the connections between identities?

2) If I disable the dataset for Profile, will it unstitch?

3) Is there an easier way to update all this without going through all the journeys to use a new action?

 

Thank you.

9 Replies


Level 6

@akwankl A few details I would need in order to help with the situation:

 

  1. Does the custom action generate an XDM payload or a raw payload for ingestion?
  2. What part of updating the custom action to reflect the changes makes you nervous?
  3. What volume of records is affected?

~cheers,

Naresh Nakirikanti.


Level 7

Hi @nnakirikanti 

 

1) Can you clarify the difference? I think it generates the XDM payload and ingests it directly into a profile-enabled dataset, and we see the event in real time.

2) My understanding is that the only way to unstitch all the identity links created from this dataset is to delete it completely. If I delete it, the current action would write to a deleted dataset, which would be useless. I would then need to create a new action (because you can't modify a live action that is being used by journeys), point it to a new dataset, and update the 50+ journeys we have that use the action.

3) Around 500K+ records.

 

Thanks.


Level 6

@akwankl 

Here is the recommended approach to solve your case while preserving the historical data.

1) Write a query that produces a derived dataset of the identity links you want to delete, and share it with Adobe Support so they can remove those references from Identity Service. I have done this with different clients for different cases.

2) Write a query on the source dataset to derive a new set of records without the secondary identity, and re-insert them into the same dataset in the data lake to reflect the update.

 

I hope you have also established an insert/update audit date on the profile dataset, so that consumers can later use PARTITION BY to retrieve the latest profile record by date.
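To make the two derivations and the audit-date lookup concrete, here is a minimal Python sketch of the logic only. It is not Data Distiller or Query Service code, and the field names (`primary_id`, `secondary_id`, `updated_at`, `status`) and sample rows are hypothetical; your schema will differ.

```python
from datetime import date

# Hypothetical rows from the source dataset; field names are illustrative only.
rows = [
    {"primary_id": "crm-1", "secondary_id": "shared-x",
     "updated_at": date(2024, 8, 1), "status": "sent"},
    {"primary_id": "crm-2", "secondary_id": "shared-x",
     "updated_at": date(2024, 8, 2), "status": "sent"},
    {"primary_id": "crm-1", "secondary_id": "shared-x",
     "updated_at": date(2024, 8, 5), "status": "opened"},
]


def links_to_remove(rows):
    """Distinct (primary, secondary) pairs: the list of links to hand over."""
    return sorted(
        {(r["primary_id"], r["secondary_id"]) for r in rows if r.get("secondary_id")}
    )


def without_secondary(rows):
    """Copies of the records with the secondary identity dropped."""
    return [{k: v for k, v in r.items() if k != "secondary_id"} for r in rows]


def latest_per_primary(rows):
    """Newest record per primary identity, by audit date.

    Mirrors what a ROW_NUMBER() OVER (PARTITION BY primary_id
    ORDER BY updated_at DESC) = 1 filter would keep in SQL.
    """
    latest = {}
    for r in rows:
        current = latest.get(r["primary_id"])
        if current is None or r["updated_at"] > current["updated_at"]:
            latest[r["primary_id"]] = r
    return list(latest.values())
```

The same three steps would normally be written as SQL against the dataset; the sketch is only meant to show what each query has to return.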

 

Let me know if you need more details.

 

~cheers,

Naresh Nakirikanti.


Level 2

Key AEP concepts that you need to know:

 

1. When you enable a dataset for Profile, Profile Store and Identity Store (contains the identity graph) monitor for new batches of data from the moment you enabled it. If you disable it, it stops ingesting new data. 

 

2. Deleting a dataset will delete all of its data from the Profile Store and the identity graph. The deletion from Profile Store happens sooner than the deletion from the identity graph; currently we delete the links daily.

 

3. Journeys are based on segments (batch and streaming): profiles qualify for audiences at the time the evaluation happens. Streaming segmentation is trigger-based and batch segmentation runs nightly. So when you change the datasets as described above, those evaluations will be impacted. This typically means there could be fluctuations in qualification, depending on which users come in and whether they are part of the graph collapse problem you are solving.

 

So how to clean up the identity graph:

1. Turn off all journeys that depend on segments or other attributes from the dataset of interest.

2. Create a new empty dataset and mark it for Profile.

3. Use Data Distiller to create a new batch of data with your sanitized identity associations. Read these two articles I wrote that showcase:

a. Understand what your graph looks like first:

https://data-distiller.all-stuff-data.com/unit-4-identity-graph/id-101-channel-identity-lookup-table...

b. Create a sanitized derived dataset (similar example):

https://data-distiller.all-stuff-data.com/unit-3-real-time-customer-profile/profile-102-data-enrichm...

 

As you master the new set of data with the right identity associations, Profile & identity graph will rehydrate. Turn the journeys back on and you should be good to go. 
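To illustrate why a shared secondary identity collapses the graph, and why re-ingesting sanitized associations separates the profiles again, here is a toy model in Python. It treats stitching as plain connected components over identity links. This is only an assumption for illustration, not how Identity Service is implemented, and the identity values are made up.

```python
def connected_groups(links):
    """Group identities that can reach one another through the given links."""
    parent = {}

    def find(x):
        # Find the representative of x, shortening the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(group) for group in groups.values())


# Two people whose records both carry the same (wrong) secondary identity.
collapsed_links = [
    ("crm-1", "email-a"),
    ("crm-2", "email-b"),
    ("crm-1", "shared-x"),
    ("crm-2", "shared-x"),
]

# The same data after the secondary identity has been stripped.
sanitized_links = [
    ("crm-1", "email-a"),
    ("crm-2", "email-b"),
]

collapsed = connected_groups(collapsed_links)
sanitized = connected_groups(sanitized_links)
```

With the shared identity present, everything lands in one group; without it, the two people end up in separate groups.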

 


Level 7

Hi @samahapa,

 

I understand all that, and I know it's easy to just migrate the data to another dataset and re-enable it for Profile.

 

The issue is that I have 50+ journeys using this action, which writes to the old dataset. Deleting the dataset makes that action useless. I would need to create a new action that writes to a new dataset AND update all 50+ journeys to use the new action, and that's a big hassle.

 

So I am wondering if there's a way to programmatically unlink the identities stitched from the old dataset as opposed to doing all the above.

 

Thanks.

 


Level 3

hi akwankl,

 

Please try the approach below; it might let you keep the existing custom action.

 

1) Is there another way to unstitch the connections between identities?

You can try deleting all batches in the dataset without deleting the dataset entirely.

 

2) If I disable the dataset for Profile, will it unstitch?

No, it will not unstitch the existing profile fragments, but it will prevent further stitching of profile fragments after the dataset is disabled for Profile.

 

3) Is there an easier way to update all this without going through all the journeys to use a new action?

Delete all the batches in the dataset, then go to the schema and remove the secondary identity. This ensures that the custom action still points to the existing dataset and that the data is stitched only on the basis of the primary identity.
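If there are many batches, the deletion could be scripted rather than done in the UI. The sketch below only builds the HTTP requests and sends nothing. The endpoints are my understanding of the Catalog Service API (listing batches filtered by `dataSet`) and the Batch Ingestion API (`action=REVERT`), so please verify both against the current Adobe documentation before use. The helper names and the placeholder IDs are hypothetical.

```python
import urllib.request

PLATFORM = "https://platform.adobe.io"


def platform_headers(token, api_key, org_id, sandbox):
    """Standard Platform API headers; all values come from your own project."""
    return {
        "Authorization": f"Bearer {token}",
        "x-api-key": api_key,
        "x-gw-ims-org-id": org_id,
        "x-sandbox-name": sandbox,
    }


def list_batches_request(dataset_id, headers):
    """Request that lists batches for one dataset (assumed Catalog endpoint)."""
    url = f"{PLATFORM}/data/foundation/catalog/batches?dataSet={dataset_id}"
    return urllib.request.Request(url, headers=headers, method="GET")


def delete_batch_request(batch_id, headers):
    """Request that deletes one batch (assumed Batch Ingestion REVERT action)."""
    url = f"{PLATFORM}/data/foundation/import/batches/{batch_id}?action=REVERT"
    return urllib.request.Request(url, headers=headers, method="POST")


# Placeholder credentials and IDs, for illustration only.
headers = platform_headers("TOKEN", "API_KEY", "ORG_ID", "prod")
list_req = list_batches_request("DATASET_ID", headers)
delete_req = delete_batch_request("BATCH_ID", headers)
```

Each request could then be sent with `urllib.request.urlopen(...)` once the endpoints and credentials are confirmed; trying it on a non-production sandbox first would be the safer route.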

 

Warm Regards,

Reena John


Level 7

Hi @ReenaJohn

 

Will deleting the previous batches unstitch the identity links created by the deleted batches? I have updated the schema to remove the secondary identity. 

 

Thanks.


Level 7

What if there are no batch IDs? I think the batches were already deleted by our TTL.

akwankl_0-1723658157932.png

 


Level 3

Hi @akwankl 

 

Yes, you should be able to unstitch the IDs by deleting the batches. Please check for the batch IDs directly on the dataset. If they are available there, you can use those IDs to delete the batches.

 

Warm regards,

Reena John