Expand my Community achievements bar.

Updating the documentation with a good practice.

Avatar

Level 6

6/13/21

In the documentation, a bad practice for developper is unwantedly suggested:
https://experienceleague.adobe.com/docs/campaign-classic/using/configuring-campaign-classic/api/data...


To limit the number of records returned by the query to 100:

<queryDef schema="nms:recipient" operation="select" lineCount="100">
...
To retrieve the next 100 records, run the same query again, adding the startLine attribute.

<queryDef schema="nms:recipient" operation="select" lineCount="100" startLine="100">
...


In fact, a dev should never use the startLine attribute as it contribute to a loss of performance:

exemple with a SQL translation of the 1rst example:

select top 100 from nmsRecipient

You'll get the 1rst 100 records... No issue about that

 

Now is the SQL translation of the 2nd querydef:

select top 200 from nmsRecipient

=> as the startLine begin at 100, the "top" will start at 100 and add the line count attributes (100) ... Then the JS engine will keep the last 100 records

This doc is used by developper to create code in order to loop over all the records of a table as queryDef are limited to 10000 records usually:

Imagine a loop with a lineCount of 10K... If you want to use a queryDef in order to operate on a 100K table: it will creates 10 sql queries, retrieving each time 10k more results leading to retrieve 550K results instead of 100K and each requests to be more greedy on performance (10+20+30+40+50+60+70+80+90+100).

 

The good practice should be then to use an Id (directly the id of the schema or a rowId from a wkf work table if possible) in the where clause in order to replace this "startLine" attribute:

<queryDef schema="nms:recipient" operation="select" lineCount="100">
<where>
<condition expr="[@id] > " var.lastProcessedId />
</where>
</queryDef>
I observed in the past some workflow JS activity with the "startLine" implementation that were optimized from 4h down to 2 min.

 

Placing a warning in the documentation in order to inform that they should never use startIndex as a way to loop over a schema would be really helpfull, specially for beginners.

 

Ps: TY @CedricRey  for this tip months ago

5 Comments

Avatar

Community Advisor

6/13/21

Hi,

 

The queryDef's translate into ordinary sql, js isn't used (outside interaction which doesn't have startLine attrib).

For most* rdbms' this would add 'limit n offset m' along with whatever optimizations available, such as 'top' hint for ms sql server.

The reason your queries are running faster when you add indexed conditions is they aren't scanning the full table for results.

 

Thanks,

-Jon

 

*Most since mysql uses diff syntax but is long deprecated as an acc install option. I think db2 is also weird but also deprecated now?

Avatar

Level 6

6/14/21

Hi @Jonathon_wodnicki 

Didn't know for this syntax on most RDBMS.

But carefull: there is no index on WKF worktables, even PK, (checked on PostGres & msSQL RDBMS). Reason is that AC assume that there is no other columns than the PK columns and having an index on a single column table doesn't help

Avatar

Community Advisor

6/15/21

Hi,

 

@LaurentLam  you forgot something very important in your good practice example : the order by clause

 

<queryDef schema="nms:recipient" operation="select" lineCount="100">
<where>
<condition expr={"[@id] > " + var.lastProcessedId} />
</where>
<orderBy>
<node expr="[@id]" />
</orderBy>
</queryDef>

 

Cédric

Avatar

Level 6

6/15/21

@CedricRey 

You're right about specifying it in general (so using a work table).

But in my specific example, not needed: as this is a schema integer PK, the "order by" doesn't need to be used as it will automatically use the default index (ASC order by default)

 

@Jonathon_wodnicki , after several search the "offset" SQL format is not performant as using a condition in the where clause (and an order by if needed) even on PG/Oracle/MsSQL...

 

So using this technic trick to loop on a table is not a good practice