SOLVED

Different amount of record sets in DataFeeds for same time range

Level 1

Hello everyone,

My company uses the Data Feed export via SFTP to load the tracking data into our data warehouse.

I recently adjusted the data feed (added one more eVar to the export) and started an export of a historical time range. When we loaded the data into a table and compared the number of records from the historical export with the number for the same time range imported via the daily export, we found a difference of nearly 1,000,000 hits over a period of about one month (89,651,013 hits loaded via the daily export vs. 90,596,309 hits in the manual historical export).

I would expect to receive the same number of hits for the same time range.

Do you have any idea why the number of hits differs?

Best regards,
Sebastian
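
For reference, the size of the gap can be computed directly from the two counts quoted above. This is only a small arithmetic sketch; the variable names are made up for illustration.

```python
# Hit counts quoted in the question above.
daily_hits = 89_651_013       # loaded via the scheduled daily export
historical_hits = 90_596_309  # loaded via the manual historical export

# Absolute and relative size of the gap.
extra_hits = historical_hits - daily_hits
share_of_daily = extra_hits / daily_hits

print(f"extra hits in historical export: {extra_hits:,}")
print(f"relative to the daily export:    {share_of_daily:.2%}")
```

So the historical export contains roughly one percent more hits than the daily loads for the same period.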

1 Accepted Solution

Correct answer by
Community Advisor

For your normal Data Feeds, did you add a delay to them to allow the final data to be processed and included in your feeds?

Our company has used Data Feeds on an hourly schedule for many years. When I took over from the previous analytics team and reviewed the feeds with them, I noticed they had no delay, so they were missing data at the end of every hour (particularly mobile app data, which needs to be processed).

We rebuilt all our feeds and started using the max delay to ensure that all data would be processed before the feed files were sent.

Whether you are processing hourly or daily, a lot of data that comes in right before the feed is sent out could be lost over time. That might be what you are experiencing?
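
To illustrate why the delay matters, here is a rough sketch of the idea. It assumes a simplified model in which a feed file is built at the end of the reporting window plus the configured delay and only contains hits that have finished processing by then. The function names, dates, and delay values are hypothetical, and this is not Adobe's actual processing logic.

```python
from datetime import datetime, timedelta

def captured_by_scheduled_feed(available_at, window_end, delay):
    """Assumed model: the file is built at window_end + delay and only
    contains hits that were fully processed by that moment."""
    return available_at <= window_end + delay

# End of a daily window (hypothetical date).
window_end = datetime(2024, 6, 2, 0, 0)

# Moments at which three hits from that window became available (illustrative).
available = [
    window_end - timedelta(minutes=5),   # processed before the window closed
    window_end + timedelta(minutes=20),  # processed shortly after
    window_end + timedelta(minutes=90),  # processed much later
]

def count_captured(delay_minutes):
    delay = timedelta(minutes=delay_minutes)
    return sum(captured_by_scheduled_feed(a, window_end, delay) for a in available)

for minutes in (0, 30, 120):
    print(f"delay {minutes:>3} min -> {count_captured(minutes)} of {len(available)} hits in the file")
```

In this model a longer delay only helps with hits that finish processing within the delay window; anything arriving later still never reaches the scheduled file.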

3 Replies


Level 1

Hi Jennifer,

Thanks for your quick response.

We just export daily and don't use a delay for the export.

I just checked some examples to see whether the additional hits have a DATE_TIME value between 0:00 and 2:00 or between 22:00 and 0:00.
There are some, but we also see hits from the historical export with a DATE_TIME value of 6:15 am, for instance.

This shouldn't occur even if we don't use a delay, or am I wrong?
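
A check like the one described above could be sketched as follows. It assumes both loads are available as lists of rows keyed by the feed columns `hitid_high`, `hitid_low` and `date_time`, and that `date_time` is formatted as `YYYY-MM-DD HH:MM:SS`. The function name and sample rows are made up for illustration.

```python
from collections import Counter
from datetime import datetime

def extra_hits_by_hour(daily_rows, historical_rows):
    """Count hits that exist only in the historical export, grouped by hour of day."""
    daily_ids = {(r["hitid_high"], r["hitid_low"]) for r in daily_rows}
    hours = Counter()
    for r in historical_rows:
        if (r["hitid_high"], r["hitid_low"]) not in daily_ids:
            ts = datetime.strptime(r["date_time"], "%Y-%m-%d %H:%M:%S")
            hours[ts.hour] += 1
    return hours

# Tiny made-up sample: one hit in both loads, two only in the historical export.
daily = [
    {"hitid_high": "1", "hitid_low": "1", "date_time": "2024-06-01 12:00:00"},
]
historical = daily + [
    {"hitid_high": "1", "hitid_low": "2", "date_time": "2024-06-01 23:50:00"},
    {"hitid_high": "1", "hitid_low": "3", "date_time": "2024-06-01 06:15:00"},
]

result = extra_hits_by_hour(daily, historical)
for hour in sorted(result):
    print(f"{hour:02d}:00 -> {result[hour]} extra hit(s)")
```

If the extra hits cluster around midnight, a missing delay is the likely cause; a spread across the whole day points to something else.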

Community Advisor

No, 6:15 sounds a little odd.

I don't suppose you changed your internal IP filters / custom bot rules? That could change how historical data is processed.

Also, do you have "offline" mobile app tracking? If you do, tracking from users on their apps when they have no internet will be stored on the device and sent as a bundle once internet is connected. That data will still be timestamped for when it occurred, so hypothetically:


June 1

1. User on Mobile App (no internet - collecting data in queue)

2. Data is exported just after midnight for June 1

June 2

3. User opens Mobile App (internet enabled - bundled and recent data will be sent to the suite; the queued data will be added to June 1)

4. Data is exported just after midnight for June 2 (this isn't really a "delta" feed, that sends all new data since the last pull, but it's time-blocked to June 2 - so the queued data wouldn't be included)

Now, this Mobile App example only applies if your mobile app is configured for offline tracking. We don't have this, so we don't have to deal with it (our app doesn't work without internet, so it's a moot point for us), but it's a real possibility for some setups, and this might be impacting you?
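
The June 1 / June 2 timeline above can be sketched in a few lines. This is a simplified model of the scenario as described in this reply, not Adobe's actual export logic. The year, the hit records, and both function names are hypothetical.

```python
from datetime import date, datetime

# Each hit has the time it occurred on the device and the time it reached the report suite.
hits = [
    {"id": "online-june1",  "occurred": datetime(2024, 6, 1, 10, 0), "received": datetime(2024, 6, 1, 10, 0)},
    {"id": "offline-june1", "occurred": datetime(2024, 6, 1, 15, 0), "received": datetime(2024, 6, 2, 9, 0)},
    {"id": "online-june2",  "occurred": datetime(2024, 6, 2, 9, 0),  "received": datetime(2024, 6, 2, 9, 0)},
]

def daily_export(day, built_at):
    """Time-blocked export: hits timestamped on `day` that had arrived when the file was built."""
    return [h["id"] for h in hits
            if h["occurred"].date() == day and h["received"] <= built_at]

def historical_export(day):
    """Export run much later: every hit timestamped on `day`, however late it arrived."""
    return [h["id"] for h in hits if h["occurred"].date() == day]

june1_daily = daily_export(date(2024, 6, 1), built_at=datetime(2024, 6, 2, 0, 5))
june2_daily = daily_export(date(2024, 6, 2), built_at=datetime(2024, 6, 3, 0, 5))
june1_historical = historical_export(date(2024, 6, 1))

print("daily June 1:     ", june1_daily)
print("daily June 2:     ", june2_daily)
print("historical June 1:", june1_historical)
```

In this model the queued hit shows up in neither daily file but does appear in the historical export for June 1, which would make the historical export larger for the same time range.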