Expand my Community achievements bar.

Webinar: Adobe Customer Journey Analytics Product Innovations: A Quarterly Overview. Come learn for the Adobe Analytics Product team who will be covering AJO reporting, Graph-based Stitching, guided analysis for CJA, and more!

Ingestion of success events into data warehouse via data feed

Avatar

Level 2

Hi Team,

 

We are trying to ingest success events into data warehouse via data feed and have some questions for the dev team. Any proper documentation on the format of the columns (event_list) available in data feed, mapping file (event.tsv), and how the mapping (between event_list and event.tsv) should be done to display each of the events as separate columns in the data warehouse table. Any sample query for this mapping?

 

Thank you!

 

 

 

 

6 Replies

Avatar

Community Advisor

Hi @DebanjanaBh1,

 

event_list and post_events_list are comma separated list of events, but as you likely noticed, you won't see "event1,event2,etc" the way you see when sending the data to Adobe...

 

Instead you will see just a number, but even that doesn't correlate directly... as in "event1" is not "1".

 

This is because there are a lot of "standard" events, and all of them have to be accounted for along with your custom events. Events.tsv will provide that mapping.

 

Now, really, the mappings don't change from one feed to another... Unless Adobe adds some new events, the mapping should remain constant... it doesn't even reflect what you are using.. so if you add a new custom event tomorrow, the mapping file you already have should tell you what to expect (it may reflect your contract level though)

 

Sample Mapping File:

Jennifer_Dungan_0-1726080362287.png

 

 

You will also notice that the numbers aren't fully sequential... This is so Adobe can keep like items grouped together... Instance Events are in the 100s and 10000s; custom events are in the 200s and 20100s, etc.

 

 

While I am not a DB / SQL expert, I would think that trying to split the events into columns may not be the most efficient way to process the data.. I would actually be tempted to have an events lookup table (with all the event mappings from your events.tsv), and an events table that has a one to many relationship with your main table... 

 

So let's say you have a hit on row 163763, and it has 3 events: 204 (event5), 219 (event20) and 1 (purchase).

 

You would add each of those 3 events in their own row in the events table... with a foreign key referencing 163763 (the one), each of the events (204, 219 and 1) have their own row (the many), and are also a foreign key to the lookup table so you can see what event it is.

 

You should then be able to query the three tables using joins.

 

 

I don't directly work with our data lake, so I don't have any sample queries handy... and while I could try to come up with something on the spot, I wouldn't be able to validate it... 

Avatar

Level 2

Hi @Jennifer_Dungan Thank you. Can you tell me please if we should ingest event_list or post_event_list?

Our dev team will load the event_list or post_event_list (whichever we decide to go with) as-is in one table (which has all our other data like props and evars) and the event.tsv in another table. Then do we need to join these two tables as you mentioned with the visit_id or visitor_id?

Avatar

Community Advisor

Post Event would be better.. any events being set by processing rules won't exist in "events_list" but will in "post_events_list"... event if you don't have the need now, if some processing gets added in the future then you won't have to change anything.

 

So if you are going to create a new column for every event, you realize that could be up to around 1300 columns, right? Also, I really don't know what your column would hold? "1" for standard counters, the number or currency passed for other types of events? (I expect those show something like 255=5 or 256=9.99, but again, I don't work as much with that side of things...)

 

As for mapping you wouldn't use either visit id or visitor id....

 

Use my same example above, if you have post_event_list "1,204,219"... you would have to look up "1" in the lookup table to know it's "purchase" and you would have to look up "204" in the lookup table to know it's event5, and then look up "219" to know it's event20, and then apply the value to the proper column....

Avatar

Level 2

Thanks @Jennifer_Dungan . I am looking to get the events and its values in this format. My events are mostly counters.

Event X - 5

Event Y - 200

How can I get to display in this format?

Avatar

Community Advisor

Are events 5 and 200 Numeric incrementors that are counting 5 or 200 on the same hit?

 

Because if these are counters, they will only count once per hit, so there is no way these should be stored in the DB aggregated like this..

 

Or are you asking how to pull the event data in a query to get the values from the DB? Your original question sounded like you were asking how to store the data into your warehouse...

 

Now, like I said, I am not a DBA and I don't work in SQL very often, that is done by our Data Engineering Team... but I found this on Stack Overflow:

 

https://stackoverflow.com/questions/27165243/get-count-of-different-values-in-comma-separated-row-in...

 

You would have to combine this with looking up the Event ID to map to the Event Name, and whatever other query you need (whether you want to pull by a visit, or a visitor, or just within a time frame, etc.

Avatar

Level 2

@Jennifer_Dungan Yes first part of the question was how to store the events in data lake and second part of the data is how to display that in a tabular format similar to how we see in workspace.