Building a data pipeline for Snowflake | Community
Level 2
September 16, 2020
Solved

Building a data pipeline for Snowflake

  • September 16, 2020
  • 1 reply
  • 3404 views

We are trying to build a data pipeline using the Marketo REST API, AWS Lambda, and Snowflake. We recently exceeded the API limits (the limit was 10K/day; it has since been raised to 50K), and we want to avoid such problems in the future.

 

I am trying to understand the API query limits better. I found this blog post explaining the limits, but I am hoping to get some more details. 

  1. If we use the REST API with a dynamic date range for lead activity data (for 10 activity type IDs) with the next-page token, how much data (in MB) would be consumed for a date range of one week, and how many API calls could one request/response consume?
    • Up to 200K+ per month of lead activity data (approximately)
  2. How many API requests would we end up consuming if we download all leads from the Marketo lead database?
    • The oldest record is from September 2019
    • A bulk export for a 31-day period yields approximately 3K–5K records
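For question 2, the API-call side of the math can be sketched independently of the byte volume: each 31-day bulk export window corresponds to one job, so the call count depends mostly on how many windows the date range splits into. A minimal sketch in Python (the start date simply mirrors the September 2019 oldest record mentioned above; the exact calls per job depend on how often status is polled, so treat the per-window call sequence as an assumption):

```python
from datetime import date, timedelta

def export_windows(start, end, max_days=31):
    """Split the inclusive range [start, end] into consecutive
    windows of at most max_days days each.

    Each window corresponds to one bulk export job
    (create + enqueue + status polls + file download)."""
    windows = []
    cursor = start
    while cursor <= end:
        w_end = min(cursor + timedelta(days=max_days - 1), end)
        windows.append((cursor, w_end))
        cursor = w_end + timedelta(days=1)
    return windows

# Roughly the range in the question: September 2019 to the post date.
windows = export_windows(date(2019, 9, 1), date(2020, 9, 16))
# 382 days -> 13 windows of <= 31 days each; a handful of API
# calls per window, so a few dozen calls for the whole backfill.
```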

Thank you in advance!

1 reply

SanfordWhiteman
Level 10
September 16, 2020

You really should be using the Bulk Extract API for this, not the paginated export. 
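For context on why the call overhead is small: a single Bulk Extract job touches only a handful of endpoints. A sketch of the per-job endpoint paths (the Munchkin-style base URL is a placeholder for your instance, and the paths follow Marketo's documented `/bulk/v1/leads/export/...` pattern; this is an illustration of the job lifecycle, not a verified client):

```python
# Placeholder; each Marketo instance has its own REST base URL.
BASE = "https://<munchkin-id>.mktorest.com"

def export_endpoints(export_id):
    """Endpoint paths for one Bulk Extract leads job.

    The lifecycle is: create (returns the export id), enqueue,
    poll status until complete, then download the file —
    roughly four calls per job plus status polls."""
    return {
        "create":  f"{BASE}/bulk/v1/leads/export/create.json",
        "enqueue": f"{BASE}/bulk/v1/leads/export/{export_id}/enqueue.json",
        "status":  f"{BASE}/bulk/v1/leads/export/{export_id}/status.json",
        "file":    f"{BASE}/bulk/v1/leads/export/{export_id}/file.json",
    }
```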

 

As for the data transfer in bytes, it's impossible to guess. You have to baseline it yourself against historical (i.e., current) data, and make sure you have, say, 20% headroom.

kdutta-2 (Author)
Level 2
September 16, 2020

Thank you for such a quick response!!! 👍

I had to use the paginated export method because I wanted to download data beyond the 31-day limit. With a bulk export, I did not know what the output would look like, since each activity type ID has a different set of 'attributes'. I will have to test the Bulk Extract API approach for lead activities as well. For the leads data, I have no choice but to use the bulk export, and its output is much simpler too, so it's working out for me.

 

I am a bit confused about how the Marketo API measures usage, and whether byte size takes precedence over the number of API calls... Thank you!

SanfordWhiteman (Accepted solution)
Level 10
September 17, 2020

Why do you need more than 31 days in a single job? 

 

Why can't you use the Bulk Extract for Leads, exactly?

 

The Bulk Extract is, in practice, metered only by bytes (500 MB/day). The number of API calls to queue up extracts is so small as to be negligible (if you're that close to running out of calls, you're in big trouble).
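That metering model can be sanity-checked with simple arithmetic: measure your baseline daily extract size, then confirm it fits under the 500 MB/day byte quota with room to spare. A minimal sketch (the 20% headroom figure echoes the earlier reply; the function name is made up for illustration):

```python
# Bulk Extract daily byte quota: 500 MB/day.
DAILY_QUOTA_BYTES = 500 * 1024 * 1024

def fits_quota(baseline_bytes_per_day, headroom=0.20):
    """True if the baseline daily extract size, inflated by the
    headroom fraction, still fits under the daily byte quota."""
    return baseline_bytes_per_day * (1 + headroom) <= DAILY_QUOTA_BYTES

fits_quota(300 * 1024 * 1024)  # 300 MB * 1.2 = 360 MB -> fits
fits_quota(450 * 1024 * 1024)  # 450 MB * 1.2 = 540 MB -> does not fit
```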