I have a route in my scenario where the objective is to truncate all strings in a column that are 2000 characters or greater, and I'm hoping someone can suggest a more elegant solution.
Right now, I use a Search module as the first module in the route, and it searches for ANY entry in the data store that has a unique key (i.e., EVERY row in the data store).
Then I tried to use an Iterator module to go through what I thought was every search result; only if the string is 2000+ characters would a type-specific variable (depending on the row type) be set to create a truncated string, ultimately overwriting the value in the original string field of the data store.
However, this approach has produced very problematic results. I learned that the Iterator module processed each search result in an array separately, creating individual bundles, and the bundles kept proliferating: even though there were only 161 rows in the data store, it was processing 12,000+ bundles. According to Adobe, the module is currently set to return all records from the data store as long as the unique key exists. This means that on the first pass one bundle was returned, on the second pass two bundles, on the third three, and so on, until operation 161 returned 161 bundles.
Is there a better way to do the initial search of the data store so my results are only the rows where the length of the string field is 2000+, and then to go row by row through those results without the bundles proliferating to the degree they did? Someone suggested capturing the protocolNumber in a Set Variable module and retrieving it just before module 166, so I could refine the filter to consider only the most recent record; this would then reset with the next module and avoid the proliferation. I haven't tried that approach yet, and before I begin, I thought I'd ask whether the community had any other ideas.
Thank you for your thoughts, and let me know if I can clarify my problem any better for you.
Hi @Mylah_D
I think you're on the right track here.
All "Search" modules have an implicit iterator, ie their output is a list of bundles, not an array. So you don't actually need an iterator here.
Remove that and the rest looks good to me. You may want to uncheck "insert missing record" though.
Since it's 4 specific fields you could reduce the set you operate on by adding filters:
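Expressed as a formula, the filter on the connector right after the Search module could look something like this (a minimal sketch, assuming the Search module is module 1 and the four fields are named field1 through field4; substitute your actual module number and field names):

    length(1.field1) >= 2000
    OR length(1.field2) >= 2000
    OR length(1.field3) >= 2000
    OR length(1.field4) >= 2000

In the filter UI this would be four OR-ed conditions using the "Greater than or equal to" numeric operator, with {{length(1.fieldN)}} as the left operand and 2000 as the right.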
I had meant to also share the screenshot of the truncating logic downstream of the Data Store Search module, which runs "for each" search result:
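(Since the screenshot isn't reproduced here: the mapping in that step amounts to something like the following Fusion formula. This is a sketch only; field1 stands in for each type-specific field, and truncating to the first 2000 characters is an assumption based on the description above.)

    if(length(1.field1) >= 2000; substring(1.field1; 0; 2000); 1.field1)

substring(text; start; end) returns the characters between the zero-based start (inclusive) and end (exclusive) positions, so this keeps at most the first 2000 characters of the field.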
This is the route in its entirety:
This is the last module used to overwrite the type-specific string:
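(In outline, that overwrite step is a Data Store "Update a record" module mapped roughly as follows; the key field protocolNumber comes from the original post, while the variable name truncatedValue is illustrative:)

    Data Store: Update a record
      Key:    {{1.protocolNumber}}
      field1: {{truncatedValue}}    <- the type-specific variable set upstream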
Thank you, @Sven-iX.
All "Search" modules have an implicit iterator ... <insert mind-blown emoji here>.
This explains so much. I've already implemented your suggestion and have seen a vast improvement: no timing out, and the scenario completes successfully. (Now to look through all my scenarios and remove iterators where they aren't necessary. Appreciate the assist there!!) The additional filter suggestion was very helpful as well. I was trying to apply filters inside the Search module to reduce the set of results there, but didn't think to add a filter on the connector beyond the Search module. Thank you.
Hi @Mylah_D, that's great - thanks for responding!
FYI: If Fusion populates the data store, you could check the field content length at that moment and set flags in the store (field1Oversized, field2Oversized, ...), or even create truncField1, truncField2, etc. and populate them with the truncated values. It depends on the use case: if storage is transient, the second option might make sense; if it's semi-permanent, I'd go with the first.
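As a sketch of that write-time check (field1 and the module reference are placeholders; the flag and truncated-field names follow the suggestion above):

    field1Oversized: {{length(1.field1) >= 2000}}
    truncField1:     {{substring(1.field1; 0; 2000)}}

Either way, the later search can then filter on the flag (or read the pre-truncated field) instead of scanning every record.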
Ideally, you don't pull ALL records if you only need to operate on a subset (because the search will return at most 2000 results, IIRC).
Another great suggestion. I will do that, @Sven-iX. Thanks!