Advanced workflow data enrichment | Community
Darren_Bowers
Level 9
March 9, 2020
Solved

Advanced workflow data enrichment


Hi All - I want to do some advanced data enrichment for all targets in a workflow, which ultimately requires a fair amount of complex JavaScript. I have two large switch and if/then/else statements that populate three columns based on Recipient attributes (two text fields and one bit.ly link), which are then used in the email or SMS.

Right now, I am using a JS queryDef select to get all records, then iterating over them one by one to update the corresponding fields for each target. This then updates the temporary schema in the workflow.

This is an issue for two reasons:

  1. Above 10,000 records the queryDef has to be paged, as an individual select returns at most 10k rows
  2. Somewhere past 10k records you run out of JS memory (even after increasing the JS memory limit)

Adobe Support says it's bad practice, but hasn't suggested an alternative, as this is probably a Consulting engagement.

Is there a better way to do advanced enrichment during a workflow with >10k records?

Best answer by Darren_Bowers


5 replies

Milan_Vucetic
Level 9
March 9, 2020

Hi @darren_bowers ,

 

if you already have JS and are running into the JS memory / record limits, you can select the records in batches.

Set the batch size to 5K or more (but lower than 10K) and you can process as many records as you want without errors.

 

var currentStartLine = 0;
var recordsToFetchPerBatch = 5000;
var loop = true;

while (loop) {
  var query = xtk.queryDef.create(
    <queryDef schema="xtk:some_schema" operation="select" lineCount={recordsToFetchPerBatch} startLine={currentStartLine}>
      <select>
        <node expr="@node"/>
        ..
      </select>
    </queryDef>
  );
  var result = query.ExecuteQuery();
  if (Object.keys(result).length > 0) {
    // do stuff
    currentStartLine = currentStartLine + recordsToFetchPerBatch;
  } else {
    loop = false;
  }
}

 

Regards,

Milan

Darren_Bowers
Level 9
March 9, 2020
Thanks Milan - I was doing this previously but still found that after a few batch loads it ran out of memory. I think it's because some of the objects don't release their JS memory properly. It's hard to diagnose because of the lack of tools available to profile the memory.
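For what it's worth, one generic pattern that can help with this (a plain JavaScript sketch, nothing Campaign-specific - `fetchBatch` and `processBatch` here are hypothetical stand-ins for the queryDef select and the enrichment logic) is to keep each batch scoped to a single loop iteration and drop the reference before fetching the next one, so the engine has a chance to reclaim the memory:

```javascript
// Generic sketch: process records in batches while keeping only one
// batch referenced at a time. fetchBatch/processBatch are hypothetical
// stand-ins for the Campaign queryDef select and enrichment logic.
function processInBatches(fetchBatch, processBatch, batchSize) {
  var start = 0;
  var total = 0;
  while (true) {
    var batch = fetchBatch(start, batchSize); // load one page of records
    if (batch.length === 0) break;            // no more records: stop
    processBatch(batch);                      // enrich this page only
    total += batch.length;
    start += batchSize;
    batch = null; // drop the reference so the batch can be collected
  }
  return total;
}
```

Whether this actually frees memory depends on the engine; in older interpreters, long-lived references accumulated in outer scope are a common cause of the growth described above.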
Darren_Bowers
Level 9
March 11, 2020

Hi @milan_vucetic - attaching some example code here as I can't do it in the comment thread.

If you comment out the line that does the write-back (datalayer.writeRecord(record);), then the query loads all records properly and in the correct batches. If you write back to the temporary schema, then the select misses a bunch of records.

 

/**
 * Adds the bit.ly links and additional notes onto the schema records
 **/

// Get the schema name (skipping namespace)
var schemaName = vars.targetSchema.substring(vars.targetSchema.indexOf(":") + 1);

var datalayer = {
  /**
   * Get all Movers records in targetSchema
   **/
  get: {
    records: function (recordsToFetchPerBatch, currentStartLine) {
      var query = xtk.queryDef.create(
        <queryDef schema={vars.targetSchema} operation="select" lineCount={recordsToFetchPerBatch} startLine={currentStartLine}>
          <select>
            <node alias="switchDocumentId" expr="@switchDocumentId"/>
            <node alias="serviceOrderId" expr="@serviceOrderId"/>
            <node alias="segmentCode" expr="@segmentCode"/>
            <node alias="completedDate" expr="@completedDate"/>
            <node alias="scheduledDate" expr="@scheduledDate"/>
            <node alias="bitly" expr="@bitly"/>
            <node alias="edmLink" expr="@edmLink"/>
            <node alias="divisionId" expr="@divisionId"/>
            <node alias="customerType" expr="@customerType"/>
            <node alias="IrDescription" expr="@IrDescription"/>
            <node alias="IrMessage" expr="@IrMessage"/>
          </select>
          <where>
            <condition expr="@switchDocumentId IS NOT NULL" bool-operator="AND"/>
            <condition expr="@bitly IS NULL" bool-operator="AND"/>
          </where>
        </queryDef>
      ).ExecuteQuery();
      return query;
    }
  },

  /**
   * Write record back to temp schema
   */
  writeRecord: function (data) {
    try {
      sqlExec("update " + vars.tableName +
        " set sbitly='" + data.bitly + "', sedmLink='" + data.edmLink +
        "', sIrDescription='" + data.IrDescription + "', " +
        "mIrMessage='" + data.IrMessage +
        "' where sswitchDocumentId = '" + data.switchDocumentId +
        "' AND sserviceOrderId = '" + data.serviceOrderId + "';");
    } catch (e) {
      logError("Something bad happened in writeRecord: " + e);
    }
  }
}

var businessLogic = {
  /**
   * Logs all records in dataset
   **/
  displayAllRecords: function (records, schemaName) {
    logInfo("Dumping all records:");
    for (var key in records[schemaName]) {
      if (records[schemaName].hasOwnProperty(key)) {
        logInfo(key + " -> " + records[schemaName][key]);
      }
    }
  },

  /**
   * Finds the bit.ly link for the given record
   **/
  findBitlyLinks: function (records, schemaName) {
    logInfo("findBitlyLinks - starting...");
    for (var key in records[schemaName]) {
      var record = records[schemaName][key];
      // some business rules in here to determine unique bit.ly and edmLink for each record
      record.bitly = "XXXXXX";
      record.edmLink = "http://www.somelink.com/blah.html";
    }
  },

  /**
   * Create the notes for each set of valid segment codes
   **/
  createIRNotes: function (records, schemaName) {
    for (var key in records[schemaName]) {
      var record = records[schemaName][key];
      switch (record.segmentCode) {
        case 'MySegCode':
          record.IrDescription = 'Something';
          record.IrMessage = 'Something Descriptive';
          break;
        case 'MyOtherSegCode':
          record.IrDescription = 'Something Else';
          record.IrMessage = 'Something Else Descriptive';
          break;
        default:
          record.IrDescription = 'Missing';
          record.IrMessage = 'Missing';
      }
      // write the record back
      datalayer.writeRecord(record);
    }
  }
}

var currentStartLine = 0;
var recordsToFetchPerBatch = 200;
var loop = true;
var records = {};

while (loop) {
  // get the records from the targetSchema
  logInfo("Getting batch from " + currentStartLine + " to " + (currentStartLine + recordsToFetchPerBatch));
  records = datalayer.get.records(recordsToFetchPerBatch, currentStartLine);
  logInfo("Got " + Object.keys(records).length + " records");
  if (Object.keys(records).length > 0) {
    // do the Bit.ly links
    businessLogic.findBitlyLinks(records, schemaName);
    // create IR Notes (this is where the write happens)
    businessLogic.createIRNotes(records, schemaName);
    // increment loop
    currentStartLine += recordsToFetchPerBatch;
    records = {};
  } else {
    logInfo("Exiting..");
    loop = false;
  }
}
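The skipped records are consistent with how offset paging interacts with the `@bitly IS NULL` filter: once a batch is written back, those rows no longer match the WHERE clause, so advancing startLine skips over rows that were never seen. A small generic JavaScript simulation (mock data only, no Campaign APIs - `fetchPage` is a stand-in for the filtered select) illustrates the effect, and one possible fix: when the filter itself excludes processed rows, always fetch from line 0 instead of advancing the offset.

```javascript
// Mock table: 10 rows, none enriched yet (bitly === null).
function makeRows() {
  var rows = [];
  for (var i = 0; i < 10; i++) rows.push({ id: i, bitly: null });
  return rows;
}

// Simulates: SELECT ... WHERE bitly IS NULL, paged by start/size.
function fetchPage(rows, start, size) {
  return rows.filter(function (r) { return r.bitly === null; })
             .slice(start, start + size);
}

// Offset paging: advance start each loop. Misses rows, because
// writing back shrinks the filtered result set underneath us.
function enrichWithOffset(rows, size) {
  var start = 0, batch;
  while ((batch = fetchPage(rows, start, size)).length > 0) {
    batch.forEach(function (r) { r.bitly = "XXXXXX"; });
    start += size; // BUG: the rows just updated dropped out of the filter
  }
  return rows.filter(function (r) { return r.bitly !== null; }).length;
}

// Fix: keep fetching from line 0. Processed rows no longer match the
// filter, so each fetch always returns the next unprocessed slice.
function enrichFromZero(rows, size) {
  var batch;
  while ((batch = fetchPage(rows, 0, size)).length > 0) {
    batch.forEach(function (r) { r.bitly = "XXXXXX"; });
  }
  return rows.filter(function (r) { return r.bitly !== null; }).length;
}
```

Whether this exactly matches the temporary schema's behaviour is an assumption on my part; row IDs shifting after the update (as suggested below in the thread) would produce the same kind of mismatch.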

 

 

Milan_Vucetic
Level 9
March 11, 2020
Hi @DarenOrigin, I am not able to test it, but from the code it seems that the temporary schema changes rowIds after your update, which then "confuses" the loader. Are you able to create a custom table/schema and insert the processed records into it? That should work for you. Regards, Milan
Darren_Bowers
Level 9
March 12, 2020

[deleted due to double post - the forum post ordering is broken]

Darren_Bowers (Author, accepted solution)
Level 9
March 12, 2020

I found this which is exactly what I want to do (but with queryDef paging): https://technowide.net/2017/05/19/iterating-custom-functions-record/ 

But I can't see where the TempResponse temporary schema is created in the example. Is this created as a normal Schema beforehand? I think I am missing something here...

Milan_Vucetic
Level 9
March 12, 2020
Hi @DarenOrigin, from this page I can see that TempResponse is a custom table, which means it is created as a normal schema; it does not represent a temporary schema. Just create a new table, write the enriched records to it, and it will work for you. The creation of this table is not mentioned in the article - you have to do it on your own. Regards, Milan
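For reference, a custom schema of that shape might look roughly like the following. This is a sketch only - the `cus` namespace, the `tempResponse` name, and the attribute list are assumptions based on the fields used earlier in this thread, not taken from the article:

```xml
<srcSchema name="tempResponse" namespace="cus" label="Temp Response">
  <element name="tempResponse" autopk="true" label="Temp Response">
    <attribute name="switchDocumentId" type="string" length="64"  label="Switch document id"/>
    <attribute name="serviceOrderId"   type="string" length="64"  label="Service order id"/>
    <attribute name="bitly"            type="string" length="255" label="Bit.ly link"/>
    <attribute name="edmLink"          type="string" length="255" label="EDM link"/>
    <attribute name="IrDescription"    type="string" length="255" label="IR description"/>
    <attribute name="IrMessage"        type="memo"               label="IR message"/>
  </element>
</srcSchema>
```

Once the schema is saved and the database structure updated, the enrichment script can insert or update records in it instead of the workflow's temporary table.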
February 10, 2023

It sounds like you're running into performance limitations with your current approach to data enrichment in your workflow. Here are a few alternative approaches you could consider:

  1. Batch processing: Instead of processing all records at once, you could split the records into smaller batches and process each batch individually. This would help reduce the memory footprint of your JavaScript code and also reduce the size of the queryDef select. You could use a loop to process each batch of records and update the corresponding fields in the temporary schema.
  2. Use a database: You could consider storing the data in a database, such as MySQL or PostgreSQL, and using a database query to retrieve the data and perform the enrichment. This would allow you to retrieve large amounts of data efficiently and also make it easier to maintain and update the data.
  3. Use a server-side script: Another option would be to use a server-side script, such as PHP or Python, to perform the data enrichment and then pass the results back to the workflow. This would help offload the processing from the workflow and also allow you to take advantage of the more powerful processing capabilities of the server.
  4. Consider alternative CRM data enrichment techniques: You could consider alternative data enrichment techniques, such as using lookup tables, data normalization, or data aggregation, to simplify the data enrichment process. These techniques can help reduce the complexity of your JavaScript code and also make it easier to maintain and update the data.
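On point 4, the two large switch statements from the original post are a good candidate for a lookup table, which is usually easier to maintain than branching logic. A generic JavaScript sketch (the segment codes and messages are the placeholder values from the thread, not real data):

```javascript
// Lookup table keyed by segment code; replaces a large switch statement.
var IR_NOTES = {
  MySegCode:      { IrDescription: "Something",      IrMessage: "Something Descriptive" },
  MyOtherSegCode: { IrDescription: "Something Else", IrMessage: "Something Else Descriptive" }
};
var IR_DEFAULT = { IrDescription: "Missing", IrMessage: "Missing" };

// Enrich one record from the lookup table, falling back to a default
// entry when the segment code is unknown.
function applyIrNotes(record) {
  var note = IR_NOTES[record.segmentCode] || IR_DEFAULT;
  record.IrDescription = note.IrDescription;
  record.IrMessage = note.IrMessage;
  return record;
}
```

Adding a new segment then means adding one table entry rather than another case block, and the same table could even be moved into a database table and joined in an enrichment activity instead of JS.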

It's important to consider the specific requirements and constraints of your workflow when choosing the best approach for data enrichment. It's also a good idea to seek professional advice if you're unsure of the best way to proceed.