Ingestion of success events into data warehouse via data feed | Community
Skip to main content
Level 2
September 10, 2024
Question

Ingestion of success events into data warehouse via data feed

  • September 10, 2024
  • 1 reply
  • 1311 views

Hi Team,

 

We are trying to ingest success events into data warehouse via data feed and have some questions for the dev team. Any proper documentation on the format of the columns (event_list) available in data feed, mapping file (event.tsv), and how the mapping (between event_list and event.tsv) should be done to display each of the events as separate columns in the data warehouse table. Any sample query for this mapping?

 

Thank you!

 

 

 

 

This post is no longer active and is closed to new replies. Need help? Start a new post to ask your question.

1 reply

Jennifer_Dungan
Community Advisor and Adobe Champion
Community Advisor and Adobe Champion
September 11, 2024

Hi @debanjanabh1,

 

event_list and post_events_list are comma separated list of events, but as you likely noticed, you won't see "event1,event2,etc" the way you see when sending the data to Adobe...

 

Instead you will see just a number, but even that doesn't correlate directly... as in "event1" is not "1".

 

This is because there are a lot of "standard" events, and all of them have to be accounted for along with your custom events. Events.tsv will provide that mapping.

 

Now, really, the mappings don't change from one feed to another... Unless Adobe adds some new events, the mapping should remain constant... it doesn't even reflect what you are using.. so if you add a new custom event tomorrow, the mapping file you already have should tell you what to expect (it may reflect your contract level though)

 

Sample Mapping File:

 

 

You will also notice that the numbers aren't fully sequential... This is so Adobe can keep like items grouped together... Instance Events are in the 100s and 10000s; custom events are in the 200s and 20100s, etc.

 

 

While I am not a DB / SQL expert, I would think that trying to split the events into columns may not be the most efficient way to process the data.. I would actually be tempted to have an events lookup table (with all the event mappings from your events.tsv), and an events table that has a one to many relationship with your main table... 

 

So let's say you have a hit on row 163763, and it has 3 events: 204 (event5), 219 (event20) and 1 (purchase).

 

You would add each of those 3 events in their own row in the events table... with a foreign key referencing 163763 (the one), each of the events (204, 219 and 1) have their own row (the many), and are also a foreign key to the lookup table so you can see what event it is.

 

You should then be able to query the three tables using joins.

 

 

I don't directly work with our data lake, so I don't have any sample queries handy... and while I could try to come up with something on the spot, I wouldn't be able to validate it... 

Level 2
September 11, 2024

Hi @jennifer_dungan Thank you. Can you tell me please if we should ingest event_list or post_event_list?

Our dev team will load the event_list or post_event_list (whichever we decide to go with) as-is in one table (which has all our other data like props and evars) and the event.tsv in another table. Then do we need to join these two tables as you mentioned with the visit_id or visitor_id?

Jennifer_Dungan
Community Advisor and Adobe Champion
Community Advisor and Adobe Champion
September 11, 2024

Post Event would be better.. any events being set by processing rules won't exist in "events_list" but will in "post_events_list"... event if you don't have the need now, if some processing gets added in the future then you won't have to change anything.

 

So if you are going to create a new column for every event, you realize that could be up to around 1300 columns, right? Also, I really don't know what your column would hold? "1" for standard counters, the number or currency passed for other types of events? (I expect those show something like 255=5 or 256=9.99, but again, I don't work as much with that side of things...)

 

As for mapping you wouldn't use either visit id or visitor id....

 

Use my same example above, if you have post_event_list "1,204,219"... you would have to look up "1" in the lookup table to know it's "purchase" and you would have to look up "204" in the lookup table to know it's event5, and then look up "219" to know it's event20, and then apply the value to the proper column....