SOLVED

Handling over 2000 bundles in Workfront 'Search' module?

Avatar

Level 8

I'd like the community's advice on how to proceed when we're expecting more than 2000 bundles from the Workfront Search module, given that the Search module returns at most 2000 bundles. Any recommendations or solutions would be greatly appreciated. Thank you!

12 Replies

Community Advisor

This is called pagination. Essentially, you specify how many records you want, and where in the total result set the first record sits, by including $$FIRST={the zero-based index of the first record you want} and $$LIMIT={how many results you want in the response}. So if a search would return 3000 results, you could make one call with $$FIRST=0 and $$LIMIT=2000, and then a follow-up call with $$FIRST=2000 and $$LIMIT=2000.

 

https://experienceleague.adobe.com/docs/workfront/using/adobe-workfront-api/api-general-information/...
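For illustration, the $$FIRST/$$LIMIT windows for a given result count can be computed like this. This is a plain-Python sketch, not Fusion itself; the zero-based $$FIRST convention and the 2000-record cap are taken from the discussion in this thread.

```python
import math

def page_windows(total_count, page_size=2000):
    """Return ($$FIRST, $$LIMIT) pairs covering total_count records.

    $$FIRST is the zero-based index of the first record to return;
    $$LIMIT caps how many records each call may return (max 2000).
    """
    pages = math.ceil(total_count / page_size)
    return [(i * page_size, page_size) for i in range(pages)]

# A 3000-record search needs two calls:
print(page_windows(3000))  # [(0, 2000), (2000, 2000)]
```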

Community Advisor

As far as I know, yes, though to be fair I just use the custom API module 95% of the time regardless, so it's possible I'm wrong.

Level 8

Is it possible to make $$FIRST = X and $$LIMIT = Y dynamic so that I can retrieve the next set of 2000 records and continue until all the records are obtained?

Correct answer by
Community Advisor

Yes, here is my SOP process.

 

1. Make a "count" call, where the query matches your final query

2. Create a variable, I call it maxResults usually

3. Place a repeater module that starts at 0; in Repeats, put ceil({count}/{maxResults}), and in Steps, put maxResults

4. Place your "real" query, mapping i from the repeater into $$FIRST and maxResults into $$LIMIT

 

Note: I did all of that from memory, so the verbiage may not be 100% exact, but the process is correct. If you can't figure it out, let me know and I'll pull out my notes.

 

Edit: Corrected mistake in reply
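The four steps above can be sketched in plain Python. This is only an illustration, not Fusion itself: `run_search` is a hypothetical stand-in for the real search call (taking $$FIRST and $$LIMIT), and `count` stands in for the result of the matching count call.

```python
import math

def fetch_all(run_search, count, max_results=2000):
    """Steps 1-4 above as a loop: the repeater starts at 0, steps by
    maxResults, and runs ceil(count / maxResults) times, so each value
    of i maps straight into $$FIRST."""
    repeats = math.ceil(count / max_results)
    records = []
    for i in range(0, repeats * max_results, max_results):  # 0, 2000, 4000, ...
        records.extend(run_search(i, max_results))  # $$FIRST=i, $$LIMIT=maxResults
    return records

# Usage with a stub standing in for the Workfront search module:
data = list(range(4500))
stub = lambda first, limit: data[first:first + limit]
print(fetch_all(stub, len(data)) == data)  # True
```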

Level 8

Thank you for the solution. It worked for me!

Just one minor correction: in step 3, the repeater module should start at 0 instead of 1; otherwise it would skip the first record.

Thanks again!

Community Advisor

Glad to hear it! Thanks for the correction; I'll update my response so it's correct.

Level 6

As others have discussed, this is pagination.  However, you have to be thoughtful about how you manage the paginated values.  I filed an enhancement request around this that contains some more details.

The very first (or second) node in any of my scenarios is almost always a Set Variable where I specify a variable called Page Size and set it to a value like 2000.

Then I'll do the same GET:objCode/count call that's been discussed.

Next comes the Repeater, though I do things a bit differently from what others have described.  There are two options to consider:

  1. Loop where the Repeater's i variable is the page number.  (Ex: 0, 1, 2, 3, ...)
  2. Loop where the Repeater's i variable is the record count. (Ex: 0, 2000, 4000, 6000, ...)

I generally stick with option #1, so then my formula becomes:

  • Initial value: 0
  • Repeats: min(10000 ; ceil(body.data.count / Page Size))

You need the min() protection, or else you get an error if the number of pages exceeds the Repeater's 10,000-iteration limit.  (Unless you WANT the scenario to fail in that case, and then you can safely ignore this.)

Finally, the paginated GET:objCode/search then has to use the following:

  • $$FIRST = i × Page Size (or use i directly if using option #2 above)
  • $$LIMIT = Page Size
  • entryDate_1_Sort = asc
  • ID_2_Sort = asc

You need to include the two _Sort parameters to ensure nothing gets injected into the middle of the overall result set between calls.  Choose different sort fields if your business logic requires it, but you must always ensure that each record stays on the same page; otherwise you'll either duplicate or drop data across the paginated queries.

(By the way, this is also why you can't reliably do paginated queries using lastUpdateDate_Sort=asc: the ordering of records can flop around during all these asynchronous executions.  This is the root of why I filed the request for cursor-based pagination.)
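Putting the pieces above together, the page math and query parameters might look like this in plain Python. This is a sketch, not Fusion: `PAGE_SIZE` mirrors the Set Variable node, and 10,000 is the Repeater iteration cap mentioned in this thread.

```python
import math

PAGE_SIZE = 2000     # the "Page Size" Set Variable
MAX_REPEATS = 10000  # Fusion Repeater iteration cap

def repeats(count, page_size=PAGE_SIZE):
    """Repeats formula with the min() guard against the iteration cap."""
    return min(MAX_REPEATS, math.ceil(count / page_size))

def search_params(page, page_size=PAGE_SIZE):
    """Query parameters for option #1, where i is the page number."""
    return {
        "$$FIRST": page * page_size,
        "$$LIMIT": page_size,
        "entryDate_1_Sort": "asc",  # stable primary sort
        "ID_2_Sort": "asc",         # unique tiebreaker
    }

print(repeats(4500))                # 3 pages
print(search_params(1)["$$FIRST"])  # 2000
```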

Hope that helps.

Level 8

Got it! I will avoid sorting by lastUpdateDate, as you suggest. However, could you explain the need for the entryDate sort? Isn't sorting by ID sufficient, since object IDs are unique?

Also, you suggested using min() because the number of pages may exceed some limit. Is there a limit on the number of pages that can be accessed?

Thanks!

Level 6

Re: min()

The Repeater module only allows up to 10,000 iterations, and $$LIMIT caps each page at 2,000 records.  So yes, there's a limit of 10,000 × 2,000 = 20,000,000 records that you can access.

Re: entryDate Sorting

You want to ensure things remain in a consistent state.  Remember that IDs are nothing more than GUIDs, with a form similar to dfc3f4bd-00d5-4b1b-b398-c83bfefac647.  Sorting on something like this is a textual sort, and nothing guarantees that the next GUID generated will sort alphabetically after the previous one.  As such, GUIDs can't be used for primary sorting.
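A quick way to see this in plain Python, assuming random (version 4) GUIDs like the example above, which carry no time component:

```python
import uuid

# Generate GUIDs "in creation order", then sort them textually, as a
# primary sort on ID would.  Random v4 GUIDs carry no timestamp, so
# the textual order almost never matches the creation order.
created_order = [str(uuid.uuid4()) for _ in range(20)]
textual_order = sorted(created_order)
print(created_order == textual_order)  # False (with overwhelming probability)
```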

Entry date is similar to the concept of "creation date" (though not identical, as the HOUR object shows).  Because it's effectively impossible to create things in the past, it's the safest field to sort on for this.

Note that deletions can still cause problems with this strategy, because Workfront hard-deletes records (removes them entirely) instead of soft-deleting them (flagging a metadata field as "is deleted").  So you do have to be discerning about how you paginate Workfront API queries.