Hi All,
This is a rather generic question about how best to handle a high number of search results.
The Workfront Search module in Fusion gives you the option to set a limit on the number of search results returned, but what if I need all matching records and the number can be high (several hundred)? For example, I need to pull a list of all Assignments in a project and process each of them. For a large and complex project with many tasks, this list can be extensive.
How can I make sure that all matching records are returned and processed, without risking hitting the 40-minute runtime limit, etc.?
Is there a way to obtain and process them in batches?
Any ideas are appreciated.
Thanks,
Tibor
Hi @tibormolnar, here is what I've used before:
Example:
Hi @tibormolnar,
To spare you the day I recently lost when api-unsupported started returning needle duplicates among such haystack batches, I strongly urge you to ensure the request you use in step 3 (and optionally, step 1) from @Sven-iX is SORTED, to avoid such duplicates and future-proof your work.
Our www.atappstore.com lowest-level plumbing now inspects every such API call to Workfront prior to execution and, if the count exceeds the batch size but there is no "_Sort=" within, adds a magnetic "&ID_Sort=asc" to ensure Good Behavior.
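For anyone who wants to see this pattern outside Fusion, here is a minimal Python sketch of a sorted, batched search against the Workfront REST API. The host, API version, and key below are placeholders, and the batch size is just an example:

import requests

# Placeholders -- substitute your own Workfront host, API version, and key.
BASE = "https://example.my.workfront.com/attask/api/v15.0"
API_KEY = "your-api-key"
PAGE = 200  # batch size per request

def fetch_all_assignments(project_id):
    """Page through all matching assignments with a stable sort."""
    results, first = [], 0
    while True:
        resp = requests.get(
            f"{BASE}/assgn/search",
            params={
                "projectID": project_id,
                "$$FIRST": first,   # offset of the first record in this page
                "$$LIMIT": PAGE,    # page size
                "ID_Sort": "asc",   # the stable sort Doug recommends, so
                                    # records can't shift between pages
                "apiKey": API_KEY,
            },
            timeout=30,
        )
        resp.raise_for_status()
        page = resp.json()["data"]
        results.extend(page)
        if len(page) < PAGE:        # a short page means we are done
            break
        first += PAGE
    return results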
Regards,
Doug
OMG - YES - sorting is a must, thank you for adding, @Doug_Den_Hoed__AtAppStore
Had that experience too!
Seems weird the API doesn't already return a default sort...
Thanks for this, Sven!
It all makes sense conceptually. I guess I just need to learn first how to call a scenario from within another scenario (or from outside of Fusion). If you happen to know where I'd best start reading about that, I'd appreciate the link. Otherwise I'll dig into the Community topics.
Thanks,
Tibor
Oh that part is pretty simple:
In the worker scenario, you start with a webhook. Copy the hook URL.
In the calling scenario, you have an HTTP module and set the URL to the hook URL.
Pass what you need to pass as fields (see the sketch below).
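In Python terms, the calling side amounts to a single HTTP POST; the hook URL and field names below are placeholders, not anything Fusion prescribes:

import requests

# Placeholder: the URL copied from the worker scenario's Webhook module.
HOOK_URL = "https://hook.app.workfrontfusion.com/your-hook-id"

# The fields the worker scenario expects; these names are just examples.
payload = {"projectID": "your-project-id", "action": "processAssignments"}

resp = requests.post(HOOK_URL, json=payload, timeout=30)
resp.raise_for_status()
print(resp.status_code, resp.text)  # Fusion webhooks typically reply with "Accepted"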
Ah, I see. I learnt something new today.
(I've only used Workfront event listener webhooks so far, now I just discovered the "Webhook" modules.)
Thanks!
@Sven-iX, do you have an example of such batch split setup?
Hi @viovi
Go ahead and try it. I'll help you along the way.
Our data comes from a file with 1000+ entries that we need to split into batches, as we are hitting the 40-minute runtime and other limits.
I tried to set up a Repeater, and it seemed to define the batch size and number of steps correctly, but it still processes each step as an individual operation (1 operation = 1 collection of values/bundle). Also, it looks like everything was just repeated 3 times rather than split into 3 parts (i.e., 3000+ bundles instead of 1000+) when passing the data to another scenario.
How do I group the bundles and pass them to another scenario to process in 3 batches, so that, e.g., the 1st batch has bundles 1 to 500, the 2nd has 501 to 1000, and the 3rd has all the rest?
Ok, I figured this out and was able to split it into 3 batches.
So, the next question is how to pass them one at a time to another scenario for processing?
For example, pass batch 1 to the other scenario; once it has completed its processing there, pass batch 2, process it, then batch 3, and so on.
Any ideas?
Hi @viovi,
I haven't tried this myself yet, but I assume the idea here is that the 1st scenario, which creates the batches, does not wait for the 2nd scenario to finish processing the 1st batch before sending it the 2nd batch, because then its total runtime would be just as long as if it did the whole (unsplit) processing itself.
Instead, the 1st scenario creates the N batches, triggers the 2nd scenario N times, then ends. The 2nd scenario then runs N times, potentially partially in parallel.
As for how to pass the batches between the 1st and 2nd scenario, read this article:
Basically, your 1st scenario would send an HTTP request to the URL of the webhook that is the 1st module in your 2nd scenario. The data can be passed on in different ways; considering that your data set is large, you should probably use JSON.
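Just to make the fan-out idea concrete, here is a Python sketch under those assumptions (the hook URL and batch size are placeholders): split the list, fire one request per batch without waiting for the worker run to finish, and end.

import requests

HOOK_URL = "https://hook.app.workfrontfusion.com/your-hook-id"  # worker webhook (placeholder)
BATCH_SIZE = 500

def fan_out(entries):
    """Split entries into batches and trigger the worker scenario once per batch."""
    for n, start in enumerate(range(0, len(entries), BATCH_SIZE), start=1):
        batch = entries[start:start + BATCH_SIZE]
        # Each POST only queues a run of the worker scenario; we do not
        # wait for that run to finish before sending the next batch.
        resp = requests.post(HOOK_URL, json={"batchNo": n, "batch": batch}, timeout=30)
        resp.raise_for_status()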
I hope this helps,
Tibor
Thank you, @tibormolnar. That's what I was trying to implement too.
Currently I'm a bit stuck on the following: my HTTP module is sending out JSON with the batch entries (e.g., 1-500), but the 2nd scenario is still receiving them not as a batch but as individual entries, and I can't figure out why that's happening.
@viovi, have you tried enabling the "JSON pass-through" option for your webhook in the 2nd scenario?
With this setting, the output of the webhook module will be a string, the same as the payload sent to the webhook. Then you can do with that what you want, e.g. parse it as JSON.
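Roughly, this is what the receiving side then does, shown in Python terms; in Fusion the json.loads step below would be a Parse JSON module, and the payload is made up:

import json

# With "JSON pass-through" enabled, the webhook outputs the raw request body
# as one string instead of auto-parsed fields (hypothetical payload):
raw = '{"batchNo": 1, "batch": [{"id": "a1"}, {"id": "a2"}]}'

data = json.loads(raw)  # equivalent of a Parse JSON module in Fusion
print(data["batchNo"], len(data["batch"]))  # -> 1 2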
Does this help?
Thank you, I think that was my issue: the JSON formatting was not passing through properly.
Sorry, I've been sidetracked.
You already found a way, but put simply: create batches by grouping the bundles you iterate through.
This means: as we run through all the bundles, we group them into batches of a certain size.
Each of these batches we then push to the second scenario.
Here's a contrived example: I use a Repeater to create 100 bundles, and in the aggregator I group them by 20s.
Convert each batch to JSON and send it to the webhook.
Since we send the array as a named object, the webhook receives a property "batch" that is an array.
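A Python sketch of that exact flow, assuming 100 bundles grouped by 20s as in the example above (the webhook URL is a placeholder):

import requests

HOOK_URL = "https://hook.app.workfrontfusion.com/your-hook-id"  # placeholder

# Stand-in for the Repeater: 100 bundles.
bundles = [{"i": i} for i in range(1, 101)]

# Stand-in for the aggregator: group the bundles by 20s -> 5 batches.
batches = [bundles[i:i + 20] for i in range(0, len(bundles), 20)]

for n, batch in enumerate(batches, start=1):
    # Sending the array under a named key is what makes the receiving
    # webhook see a "batch" property that is an array.
    resp = requests.post(HOOK_URL, json={"batchNo": n, "batch": batch}, timeout=30)
    resp.raise_for_status()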
Thank you, I ended up with a similar setup, I just did not realize that I could pass the batchNo in the header.