Author: @Saswata Ghosh
Editor: @Danny-Miller
About Pseudonymous Profile Expiry
RTCDP’s May release saw the GA (General Availability) of pseudonymous profile data expiry, which complements the granular ‘dataset-level’ event expiration strategy discussed in our earlier blog. Once enabled, this feature removes stale pseudonymous profiles from your Experience Platform instance daily.
A pseudonymous profile has no known identity such as an email ID or CRM ID. It has only pseudonymous identities such as ECID (Experience Cloud ID), AAID, or GAID (Google Ad ID), as specified by the customer.
Targeting anonymous visitors is useful for web-personalization use cases, but only for a certain period. In general, the older the data, the less business value it has, and at some point its value falls below the cost of keeping it. Beyond this useful period, it is ideal to purge these profiles by assigning a profile-level TTL (time-to-live), or ‘pseudonymous expiry duration’, based on last activity.
Real World Customer Example
Background
This blog walks through pseudonymous expiration using the example of a reputable banking customer in Europe with an Adobe Analytics + RTCDP implementation.
After 6 months of using the Platform extensively, they requested an investigation into the sharp increase in their addressable audience, which was many times their yearly license entitlement (for both regions combined in the same org).
Analysis
They found that 90% of their profiles were ECID/AAID cookie-based pseudonymous profiles.
They saw this issue in both of their region-specific Production sandboxes.
| Step | Count of Profiles | Region 1 | Region 2 |
| --- | --- | --- | --- |
| 1 | Total profiles when the license was crossed | 25 million | 90 million |
| | ‘15-day active’ AAID/ECID pseudonymous profiles | 1 million | 6 million |
| | ‘Lifetime’ known profiles with CRM ID (alone or with ECID/AAID) | 1.5 million | 5.5 million |
| 2 | ‘Inactive’ pseudonymous profiles expired | -22.5 million | -78.5 million |
| 3 | Resulting profiles after expiration | 2.5 million | 11.5 million |
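The arithmetic behind the table can be checked with a few lines of Python. This is only an illustrative sketch: the figures are copied from the table, while the function and variable names are made up for this example.

```python
# Figures (in millions) copied from the table above.
regions = {
    "Region 1": {"total": 25.0, "active_pseudonymous": 1.0, "known": 1.5},
    "Region 2": {"total": 90.0, "active_pseudonymous": 6.0, "known": 5.5},
}


def expiry_breakdown(total, active_pseudonymous, known):
    """Return (expired, remaining) profile counts in millions."""
    # Profiles that survive: recently active pseudonymous ones plus known ones.
    remaining = active_pseudonymous + known
    # Everything else is an inactive pseudonymous profile and gets expired.
    expired = total - remaining
    return expired, remaining


for name, counts in regions.items():
    expired, remaining = expiry_breakdown(**counts)
    share = expired / counts["total"]
    print(f"{name}: expired {expired} M, remaining {remaining} M ({share:.0%} removed)")
```

Only the recently active pseudonymous profiles and the known profiles remain, which is why the resulting count in row 3 is simply their sum.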
They also evaluated:
- How did unauthenticated traffic fit into their business model?
- What was their ratio of unauthenticated to authenticated visitors?
- What were their goals with unauthenticated traffic?
- At what point did the value of this data dip below the cost of maintaining it?
Their business model was focused on authenticated traffic and on targeting and personalizing for their customers. Data from unauthenticated visitors had little value to them after 2 weeks if those visitors did not register on the website for more information or services. They therefore decided to reduce the number of events stored for unauthenticated traffic by expiring older unauthenticated profiles (beyond 15 days) while preserving the valuable events from their known customers over a longer period (365 days).
Implementation
Based on their analysis, they decided to implement a 15-day expiration of all pseudonymous profiles (those having only ECID and AAID) for both regions (sandboxes).
Earlier, they had already asked the Engineering team to apply a 365-day Experience Event expiration on their Adobe Analytics dataset, which removes older events for all profiles (both customer and pseudonymous events).
Post Implementation
After pseudonymous profile expiration was implemented on both production sandboxes, the total addressable audience dropped by about 90%, from 129 million to 14 million.
Moreover, the profile count has been steady ever since, which indicates that the expiry job is successfully purging pseudonymous profiles every day.
What does a pseudonymous profile look like? Below is a single profile that has the three characteristics we are looking for:
- No non-cookie identity namespace such as Email or CRM_ID, only the ones specified in the request, such as ECID (Experience Cloud ID) or AAID. This profile has only identities from a cookie namespace (ECID).
- No profile attribute update in the last 15 days. This snapshot was taken on June 16th, 2023.
- The last event was 15 days ago.
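The three characteristics above can be expressed as a small predicate. This is a minimal sketch for reasoning about the criteria, not the actual expiry job: the function name, the namespace set, and the choice to treat exactly 15 days of inactivity as expired are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Hypothetical set of cookie-based namespaces; use the list you give to Support.
COOKIE_NAMESPACES = {"ECID", "AAID"}


def is_expiry_candidate(namespaces, last_attribute_update, last_event, today, ttl_days=15):
    """Return True when a profile meets all three criteria listed above."""
    # 1. Only cookie-based identities, no known ID such as Email or CRM_ID.
    only_cookie_ids = bool(namespaces) and set(namespaces) <= COOKIE_NAMESPACES
    cutoff = today - timedelta(days=ttl_days)
    # 2. No attribute update since the cutoff (None means no attributes at all).
    attrs_stale = last_attribute_update is None or last_attribute_update <= cutoff
    # 3. No event since the cutoff (None means no events at all).
    events_stale = last_event is None or last_event <= cutoff
    return only_cookie_ids and attrs_stale and events_stale


snapshot = date(2023, 6, 16)
print(is_expiry_candidate({"ECID"}, None, date(2023, 6, 1), snapshot))
```

A profile that also carries a CRM_ID, or that had an event a few days before the snapshot, would not qualify under this sketch.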
Capturing ‘Last Activity Date’ in Profile Dataset (optional)
This configuration is optional if you are sure that no profile attribute (for example, consent) is populated for your anonymous users.
- The Pseudonymous Profile Expiry job does not use this external audit field (it references a hidden system date field), but a dedicated audit attribute helps with two-fold validation. We recommend adding the out-of-the-box field group “External Source System Audit Details”; a custom attribute can serve the same purpose. The two validations are:
- From the profile lookup, as we saw in the earlier section.
- Approximation of pseudonymous profiles using a test segment, as we will see in the next section.
- The ‘Last Activity Date’ field can be populated as a calculated field using the Data Prep function now(), which captures the ingestion timestamp in the Unified Profile.
Assumption: Timestamp-ordered merge policy is applied to all contributing profile datasets.
Pseudonymous Profile Pre-Assessment & Validation
How do you evaluate your own data to estimate the impact of what this might look like in your sandbox?
- Add the widget “Single Identity Profiles by Identity” to Profile Dashboard
- As a starting point, these graphs can provide clues on which namespace the profile count is coming from (e.g., cookie-based identity namespace like ECID/AAID or CRM (profileID) from known users). However, the complete metrics of Identity Overlap are best obtained via API as discussed in the next step.
- The Identity Namespace Overlap Report from the API also provides detailed visibility into the composition of your organization’s Profile Store by exposing the identity namespaces that contribute most to your addressable audience (merged profiles).
Generate the identity namespace overlap report (Profile Preview API endpoint)
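As a rough sketch, the request for this report could be assembled as follows. The base URL, the path, and the header names are written from memory of the public Real-Time Customer Profile API documentation and should be treated as assumptions to verify against the API reference; the function name is hypothetical, and the request is built but not sent.

```python
import urllib.request

# Assumed base URL and path; confirm both against the Profile API reference.
BASE_URL = "https://platform.adobe.io/data/core/ups"
OVERLAP_PATH = "/previewsamplestatus/report/namespace/overlap"


def build_overlap_request(access_token, api_key, org_id, sandbox_name):
    """Build (but do not send) a GET request for the namespace overlap report."""
    headers = {
        "Authorization": f"Bearer {access_token}",
        "x-api-key": api_key,
        "x-gw-ims-org-id": org_id,
        # The report is sandbox-specific, so run it once per production sandbox.
        "x-sandbox-name": sandbox_name,
    }
    return urllib.request.Request(BASE_URL + OVERLAP_PATH, headers=headers, method="GET")


req = build_overlap_request("<ACCESS_TOKEN>", "<API_KEY>", "<ORG_ID>", "prod")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` with real credentials would send it, and the JSON body could then be read with `json.load`.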
A successful request returns HTTP Status 200 (OK) and the identity namespace overlap report which has the profile counts against individual IDs and their various combinations.
{
"data": {
"AAID,ECID": 500000,
"ECID,crmid": 2000,
"AAID,ECID,crmid": 7000,
"AAID": 100000,
"crmid": 1000,
"ECID": 400000
},
"reportTimestamp": "2023-06-16T16:55:03.624"
}
The above report shows that there are about 10,000 known profiles (i.e., profiles with some kind of authenticated identity, like crmid):
"data": {
"ECID,crmid": 2000,
"AAID,ECID,crmid": 7000,
"crmid": 1000
},
However, there are around 1 million pseudonymous profiles (ECID, AAID, ECID+AAID):
"data": {
"AAID,ECID": 500000,
"AAID": 100000,
"ECID": 400000
},
These counts represent the number of profiles that do not have a known/authenticated Identity like email ID or CRM ID.
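The split between known and pseudonymous profiles can be reproduced from the report with a short script. This is a minimal sketch that assumes `crmid` is the only known (authenticated) namespace in the sandbox, as in the sample report; the function and constant names are illustrative.

```python
import json

# Sample report copied from the response shown earlier.
report_json = """
{
  "data": {
    "AAID,ECID": 500000,
    "ECID,crmid": 2000,
    "AAID,ECID,crmid": 7000,
    "AAID": 100000,
    "crmid": 1000,
    "ECID": 400000
  },
  "reportTimestamp": "2023-06-16T16:55:03.624"
}
"""

# Assumption: crmid is the only known namespace; extend this set for your org.
KNOWN_NAMESPACES = {"crmid"}


def split_known_pseudonymous(report):
    """Return (known, pseudonymous) profile counts from an overlap report."""
    known = pseudonymous = 0
    for combo, count in report["data"].items():
        namespaces = {ns.strip() for ns in combo.split(",")}
        # Any combination containing a known namespace is a known profile.
        if namespaces & KNOWN_NAMESPACES:
            known += count
        else:
            pseudonymous += count
    return known, pseudonymous


known, pseudonymous = split_known_pseudonymous(json.loads(report_json))
print(known, pseudonymous)  # 10000 1000000
```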
The above graphs and reports do not provide guidance on the recommended pseudonymousTTLConfig to be requested for the sandbox.
You can therefore estimate the cleanup counts for different TTL durations by creating test segments as shown below.
Audience example: “Test Pseudonymous profiles >=15days”
This single audience needs to apply rules on both ‘Profile Attributes’ and ‘Experience Events’ to identify pseudonymous profiles according to the 3 criteria mentioned earlier.
Disclaimer: This segment validation is only a preliminary assessment, to be used before the more dependable dry-run results are obtained from Engineering via a Support request.
Profile Attributes should not have non-cookie IDs (e.g. CRM_ID) or any update in the last N days (an optional criterion, only needed if you populate profile attributes such as consent for non-authenticated identities).
Experience Events should EXCLUDE all profiles having any event with a non-cookie ID or any new event in the last N days.
Assumption for Events: If the dataset has been enabled for Profile using the schema’s identityMap, then the rules below can be used; otherwise, refer to the next section for cases where identityMap is not used.
For any other web events feed, if the known ID (e.g. CRM_ID) is expected in an eVar that is marked as an identity, then it must be specified in the Exclude criteria.
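If you can export last-activity dates for cookie-only profiles, the effect of different TTL durations can also be approximated offline. The sketch below uses made-up sample dates and a hypothetical helper name; it only illustrates the counting logic and is no substitute for the dry-run results.

```python
from datetime import date


def count_expiring(last_activity_dates, today, ttl_days):
    """Count pseudonymous profiles whose last activity is at least ttl_days old."""
    return sum((today - d).days >= ttl_days for d in last_activity_dates)


# Hypothetical last-activity dates for four cookie-only profiles.
sample = [date(2023, 6, 15), date(2023, 6, 1), date(2023, 5, 10), date(2023, 3, 1)]
today = date(2023, 6, 16)

# Compare how many profiles each candidate TTL would remove.
for ttl in (7, 15, 30, 90):
    print(f"TTL {ttl} days: {count_expiring(sample, today, ttl)} profiles would expire")
```

Comparing the counts across TTL values gives a feel for how sensitive the cleanup is to the chosen duration.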
Support Request for Pseudonymous Profile Expiration
Pseudonymous Profile data expiration cannot be configured through the Platform UI or APIs. Instead, you must contact support to enable this feature.
Other Considerations
- Pseudonymous profile data expiry can run with a different configuration on each production sandbox under the same org.
- Once you have activated this feature, the deletion of profiles is permanent. There is no way to roll back or restore the deleted profiles.
- This is not a one-time cleanup job. Pseudonymous profile data expiry will continually run once per day.
- It drops entire profiles that match the customer’s Identity input and inactivity criteria irrespective of whether it is an attribute-only profile or event-only profile or both.
- “Pseudonymous Profiles Data Expiration” permanently removes profiles only from the Unified Profile Store (UPS). It does not remove data from the data lake or the Identity graph (UIS).
- Please expect a lead time of 1-2 weeks for the dry-run results that you will review with your customer.
- A similar additional lead time should be expected for the actual implementation once you have reviewed the dry-run results for different TTL periods and confirmed the TTL duration for the sandbox.