Bulk Lead Extract API questions | Community
Kurt_Koller
Level 4
July 18, 2017
Solved

Bulk Lead Extract API questions


Working with the Bulk API, have a few questions about it.

1. What's the easiest way to get all leads for an initial sync via the Bulk API? You need to have a filter. Say I have an instance with 150k leads created over 3 years. Do I need to make 30-day Bulk API requests starting at some arbitrary old date to get all leads? This needs to be done through the API. It seems that in this scenario using the non-bulk REST API with a paging token would be faster (see #3), even though it would result in more calls. This is what we are currently doing, but we'd hoped to replace it with the Bulk API. (If this is how it must be, we would LOVE to see the 30-day limit raised to 31 days so we can logically pull by calendar month.)

2. (minor) Why does the Bulk API return the word "null" for a missing field in a CSV? In a CSV, a missing/null value is just missing, not the word "null". Here's an example:

id,email,company,sfdctype,postalcode,inferredpostalcode,inferredmetropolitanarea,country,inferredcountry,persontype
408,me@example.com,MyCompany,Contact,null,null,null,United States,null,contact

In every other CSV scenario I've worked in, this would be represented as:

id,email,company,sfdctype,postalcode,inferredpostalcode,inferredmetropolitanarea,country,inferredcountry,persontype
408,me@example.com,MyCompany,Contact,,,,United States,,contact

3. Why does the startedAt date change/why the time weirdness here? I created a lead export of 1 month of leads filtered by createdAt. Here are the stages:

Create:

{
    "fields": ["id", "email", "company", "createdAt", "updatedAt", "unsubscribed", "emailInvalid", "originalSourceType", "sfdcType", "postalCode", "inferredPostalCode", "inferredMetropolitanArea", "country", "inferredCountry", "personType"],
    "format": "CSV",
    "filter": {
        "createdAt": {
            "startAt": "2017-06-01T00:00:00Z",
            "endAt": "2017-07-01T00:00:00Z"
        }
    }
}

Response:

{
    "requestId": "bcfe#15d53739e6b",
    "result": [
        {
            "exportId": "b70642a3-8a89-4bf1-a441-24b214096b78",
            "format": "CSV",
            "status": "Created",
            "createdAt": "2017-07-18T02:07:52Z"
        }
    ],
    "success": true
}

Status after enqueue:

{
    "requestId": "1796#15d53747d57",
    "result": [
        {
            "exportId": "b70642a3-8a89-4bf1-a441-24b214096b78",
            "format": "CSV",
            "status": "Queued",
            "createdAt": "2017-07-18T02:07:52Z",
            "queuedAt": "2017-07-18T02:08:49Z"
        }
    ],
    "success": true
}

Status after change from "Queued" to "Processing":

{
    "requestId": "16721#15d5374f1fe",
    "result": [
        {
            "exportId": "b70642a3-8a89-4bf1-a441-24b214096b78",
            "format": "CSV",
            "status": "Processing",
            "createdAt": "2017-07-18T02:07:52Z",
            "queuedAt": "2017-07-18T02:08:49Z",
            "startedAt": "2017-07-18T02:09:00Z"
        }
    ],
    "success": true
}

Status after complete:

{
    "requestId": "17c12#15d53bbac87",
    "result": [
        {
            "exportId": "b70642a3-8a89-4bf1-a441-24b214096b78",
            "format": "CSV",
            "status": "Completed",
            "createdAt": "2017-07-18T02:07:52Z",
            "queuedAt": "2017-07-18T02:08:49Z",
            "startedAt": "2017-07-18T03:26:09Z",
            "finishedAt": "2017-07-18T03:26:25Z",
            "numberOfRecords": 43,
            "fileSize": 7389
        }
    ],
    "success": true
}

Why does the startedAt date change? I polled status.json every minute or so, and startedAt stayed the same as in the "Processing" status above until the job finished, at which point it jumped forward.

3a. Also under what circumstances would an export of 1 month of lead records out of an instance that has 1600 leads take more than 1 hour 15 minutes? The final result looks like it took 16 seconds but it entered processing way before that. There is no other API usage, no other Bulk API jobs are queued, etc. I've repeated this 3 times in a row.

4. There's a note in the API docs on the filter type updatedAt:

"Filter type is unavailable for some subscriptions. Marketo Support can provide you with this information."

How can we find out via the API whether this is available? Just call it and handle a filter error? Or will it just return an empty set? When we run against an instance without knowing whether it has this feature, how would we find and extract leads updated since a given date through the Bulk API? So far I haven't seen this on instances I've tested, but I'd like to know what to do.

Any insight is appreciated.
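
For reference, the timings described in 3a can be read straight off the final "Completed" status payload. The following is a minimal sketch, not part of the original post; the helper names `parse_ts` and `job_timings` are hypothetical, and it assumes the timestamps are always UTC in the `YYYY-MM-DDTHH:MM:SSZ` form shown above.

```python
from datetime import datetime, timezone

def parse_ts(value: str) -> datetime:
    # Assumes the UTC "Z" format seen in the status payloads above.
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)

def job_timings(status: dict) -> dict:
    """Return wait and processing durations (in seconds) for one export job."""
    created = parse_ts(status["createdAt"])
    queued = parse_ts(status["queuedAt"])
    started = parse_ts(status["startedAt"])
    finished = parse_ts(status["finishedAt"])
    return {
        "queue_wait_s": int((started - queued).total_seconds()),
        "processing_s": int((finished - started).total_seconds()),
        "total_s": int((finished - created).total_seconds()),
    }

# Values copied from the "Completed" status in question 3.
completed = {
    "createdAt": "2017-07-18T02:07:52Z",
    "queuedAt": "2017-07-18T02:08:49Z",
    "startedAt": "2017-07-18T03:26:09Z",
    "finishedAt": "2017-07-18T03:26:25Z",
}
timings = job_timings(completed)
print(timings)
```

With the final payload, processing comes out at 16 seconds while the total from creation to completion is 4713 seconds (about 1 hour 18 minutes), which is the gap the question is asking about.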

Best answer by everly

Yes, a 31-day date range is now supported.
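
As an illustration of what an initial sync then looks like, here is a minimal sketch that splits a long period into consecutive windows no longer than the supported range and builds a createdAt filter for each. The helper names `export_windows` and `to_filter` are hypothetical, and the 31-day default is taken from the answer above. Whether endAt is inclusive is not stated in this thread, so adjacent windows share a boundary here and a caller may want to de-duplicate by lead id.

```python
from datetime import datetime, timedelta, timezone

def export_windows(start, end, max_days=31):
    """Yield (start_at, end_at) pairs covering start..end in chunks of at most max_days."""
    if max_days < 1:
        raise ValueError("max_days must be at least 1")
    step = timedelta(days=max_days)
    cursor = start
    while cursor < end:
        window_end = min(cursor + step, end)
        yield cursor, window_end
        cursor = window_end

def to_filter(start_at, end_at):
    # Same filter shape as the Create request in the question.
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return {"createdAt": {"startAt": start_at.strftime(fmt),
                          "endAt": end_at.strftime(fmt)}}

# Example period: the first quarter of 2017.
start = datetime(2017, 1, 1, tzinfo=timezone.utc)
end = datetime(2017, 4, 1, tzinfo=timezone.utc)
windows = list(export_windows(start, end))
filters = [to_filter(s, e) for s, e in windows]
for f in filters:
    print(f)
```

Each filter would go into its own export job. Fixed-size windows are the simplest choice; pulling by calendar month, as the question suggests, would also fit within a 31-day limit.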

1 reply

Adobe Employee
August 1, 2017

1. Correct, using bulk you would have to pull leads in 30-day increments. I have created an enhancement request to increase the date range to 31 days.

2. This design choice maintains consistency with bulk lead export from within the Marketo UI, i.e. a precedent had been set.

3. I am unable to reproduce this behavior. I recommend filing a support case with Marketo.

3a. There are many variables that impact performance in a multi-tenant system. You might try enqueuing the job during off hours.

4. You can determine whether the instance supports updatedAt by calling the Create Export Lead Job endpoint. If it is not supported, you will receive error 1035, "Unsupported filter type for target subscription".
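
A minimal sketch of the check described in point 4, operating on an already-parsed response body. The function names `filter_unsupported` and `choose_filter_type` are hypothetical, and the layout of the error payload (a top-level `errors` list of objects with `code` and `message`) is an assumption, not something shown in this thread; only the code 1035 and its message come from the reply above.

```python
UNSUPPORTED_FILTER_CODE = "1035"

def filter_unsupported(response: dict) -> bool:
    """Return True if a Create Export Lead Job response reports error 1035."""
    if response.get("success"):
        return False
    # Assumed shape: {"success": false, "errors": [{"code": ..., "message": ...}]}
    return any(str(err.get("code")) == UNSUPPORTED_FILTER_CODE
               for err in response.get("errors", []))

def choose_filter_type(response_for_updated_at: dict) -> str:
    # Fall back to createdAt when the subscription rejects updatedAt.
    return "createdAt" if filter_unsupported(response_for_updated_at) else "updatedAt"

# Hypothetical rejected response, built from the error quoted in point 4.
rejected = {
    "requestId": "hypothetical#0",
    "success": False,
    "errors": [{"code": "1035",
                "message": "Unsupported filter type for target subscription"}],
}
# Accepted response, trimmed from the Create response in the question.
accepted = {"requestId": "bcfe#15d53739e6b", "success": True,
            "result": [{"status": "Created"}]}

print(choose_filter_type(rejected))
print(choose_filter_type(accepted))
```

Comparing the code as a string tolerates it arriving as either a number or a string.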

Kurt_Koller
Level 4
August 9, 2017

@David Everly thanks for the response. On #2, I just pulled an export of leads from the UI with the same fields, and none of the rows contain the word null. Here is a sample line from the CSV for my account in our system; note the ,,, rather than ,null,null,null:

1013663,kk@digitalpi.com,Digital Pi,2016-04-26 15:13:15,2017-07-27 23:44:17,,,1,Omega,List import

Adobe Employee
August 9, 2017

Slight correction to #2: the design decision was to be consistent with the REST API. For example, a call to Get Multiple Leads by Filter Type returns null for non-existent lead fields. Here is an example with the non-existent field "leadRole" for a given lead id "318581":
GET /rest/v1/leads.json?filterType=id&filterValues=318581&fields=leadRole

{
    "requestId": "15426#15dc7cc2251",
    "result": [
        {
            "id": 318581,
            "leadRole": null
        }
    ],
    "success": true
}
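
Given that behavior, a consumer of the bulk CSV may want to map the literal word null back to a missing value while parsing. This is a minimal sketch using Python's standard csv module; the helper name `read_export` is hypothetical, and it cannot distinguish a field whose real value is the text "null" from a missing one.

```python
import csv
import io

def read_export(csv_text: str, null_token: str = "null") -> list:
    """Parse a bulk export CSV, turning the literal null token into None."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {key: (None if value == null_token else value) for key, value in row.items()}
        for row in reader
    ]

# Sample rows copied from question 2 above.
sample = (
    "id,email,company,sfdctype,postalcode,inferredpostalcode,"
    "inferredmetropolitanarea,country,inferredcountry,persontype\n"
    "408,me@example.com,MyCompany,Contact,null,null,null,United States,null,contact\n"
)
rows = read_export(sample)
print(rows[0])
```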