SOLVED

Architectural question: Optimal Workflow Approach for High-Volume Rendition Generation in DAM

Adobe Champion

Issue & Background:

We manage a DAM system with over 3 million assets, where every uploaded asset goes through the DAM Update Workflow. This workflow includes several steps, such as generating a 4K JPEG rendition and a few OOB thumbnail renditions.

A new requirement emerged where assets within a specific folder—/product-assets (containing over 1 million assets)—now need an additional 4K PNG rendition along with the existing 4K JPEG rendition.

To prevent adding extra load on the DAM Update Workflow, we created a separate Rendition Maker Workflow specifically for this 4K PNG rendition and applied it only to the /product-assets folder via a launcher.

As a result, assets in this folder now go through two workflows:

  1. DAM Update Workflow (standard process)
  2. Rendition Maker Workflow (4K PNG rendition generation)

However, we have observed the following issues, which we suspect (this is an assumption, not yet confirmed) are caused by two workflows running simultaneously on the same asset:

  • Some renditions getting corrupted or not being generated properly
  • Replication queue getting blocked when these assets are published

 

Potential Solutions:

Solution 1:

  • Eliminate the Rendition Maker Workflow entirely.
  • Modify the DAM Update Workflow by adding a custom process step to generate the 4K PNG rendition only inside the /product-assets folder.
  • This ensures that other folders are not impacted, keeping the workflow lightweight for the rest of the DAM.
  • The rendition generation will only happen for Ecommerce and Marketing images inside /product-assets.

Solution 2:

  • Retain the Rendition Maker Workflow, but enhance it to also generate the 4K JPEG and all the other OOB thumbnail renditions that the DAM Update Workflow produces, so that assets in this folder still get those renditions, but from the Rendition Maker Workflow.
  • Apply this updated workflow only to the /product-assets folder via launcher, ensuring that the DAM Update Workflow does not run on this folder.
  • This guarantees that only one workflow runs per asset in this folder.

 

Key Question: From a performance standpoint, which solution is optimal when processing large volumes (20,000+ workflows at once)?

  • Should we go with Solution 1, where we use a custom process step with path-based conditions inside the DAM Update Workflow?
  • Or should we go with Solution 2, where we rely on launcher-based conditions to ensure that only one workflow runs at a time?

Looking forward to expert insights on which approach is better for scalability and performance.

9 Replies

Community Advisor

Hi @P_V_Nair 

I would go with Solution 1, with the 4K PNG generation step executed conditionally based on a path configuration, plus a bulk update job to create the 4K PNG rendition for existing assets.

Arun Patidar

Adobe Champion

@arunpatidar Please help me understand why Solution 1 is better than Solution 2.

Correct answer by
Community Advisor

Hi @P_V_Nair 

In addition to @pat-lego's comments:

Solution 1 has the following advantages:

  • Only one workflow per asset, reducing contention.
  • Keeps all rendition logic centralized.
  • Performance: Fewer workflows, less overhead, no extra launcher logic.
  • Scalability: Easier to run in large batches (20K+) since you’re not relying on two concurrent workflows.
  • Simplicity: Cleaner and easier to debug.

Solution 2 has the following drawbacks:

  • Launcher evaluation is expensive on high-volume folders.
  • Two workflows per asset increase the load on the workflow engine unless DAM Update is disabled for /product-assets.
  • The exclusions in the DAM Update launcher must be set up very carefully; otherwise both workflows could still run.
  • Launcher-based workflows are often less performant at scale.
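
To illustrate the exclusion point, here is a minimal plain-Java sketch (no AEM dependencies) that checks whether two launcher path patterns are mutually exclusive. Both regular expressions and the /content/dam/product-assets location are assumptions made for illustration; the actual launcher patterns in a given instance may look different.

```java
import java.util.regex.Pattern;

final class LauncherPatternCheck {

    // Hypothetical pattern for the Rendition Maker launcher (assumed folder location).
    static final Pattern PRODUCT_ONLY = Pattern.compile(
            "/content/dam/product-assets(/.*)?/renditions/original");

    // Hypothetical DAM Update pattern with the product folder excluded
    // through a negative lookahead.
    static final Pattern DAM_UPDATE_EXCLUDING = Pattern.compile(
            "/content/dam/(?!product-assets/).*/renditions/original");

    // Naive DAM Update pattern without any exclusion, shown for contrast.
    static final Pattern DAM_UPDATE_NAIVE = Pattern.compile(
            "/content/dam/.*/renditions/original");

    /** Number of launchers that would fire for this path: 0, 1 or 2. */
    static int launchersFired(String path, Pattern damUpdatePattern) {
        int fired = 0;
        if (PRODUCT_ONLY.matcher(path).matches()) {
            fired++;
        }
        if (damUpdatePattern.matcher(path).matches()) {
            fired++;
        }
        return fired;
    }

    public static void main(String[] args) {
        String[] samples = {
            "/content/dam/product-assets/shoes/a.png/jcr:content/renditions/original",
            "/content/dam/brand/logo.png/jcr:content/renditions/original",
            // Sibling folder sharing the prefix: a classic boundary case.
            "/content/dam/product-assets-archive/x.png/jcr:content/renditions/original"
        };
        for (String path : samples) {
            System.out.println(path
                    + " naive=" + launchersFired(path, DAM_UPDATE_NAIVE)
                    + " excluding=" + launchersFired(path, DAM_UPDATE_EXCLUDING));
        }
    }
}
```

A check like this could be run against a sample of real asset paths before changing the launchers: any path that fires two launchers would bring back the original two-workflows-per-asset problem.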

 

Arun Patidar

Adobe Champion

Thank you @arunpatidar  and @pat-lego . Appreciate your insights.

Employee

I agree with @arunpatidar: I would use only one workflow. Workflows are independent threads that do not know of each other's existence. This causes issues in Oak at times (especially when two of them run at the same time and operate on the same payload).

 

That being said, you could try an OR split with some conditional logic: if the path matches /product-assets, go down one fork of the workflow; otherwise go down the other.
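
As a rough idea of what such a path condition costs, here is a minimal plain-Java sketch of the check. The class name, method name, and the /content/dam/product-assets root are hypothetical; in a real custom process step the path would typically come from the work item's payload, and the sketch leaves that wiring out.

```java
final class PngRenditionGuard {

    // Assumption: the /product-assets folder from the question lives here.
    private static final String PRODUCT_ROOT = "/content/dam/product-assets";

    private PngRenditionGuard() {
    }

    /**
     * Returns true only for paths strictly below the product folder.
     * The trailing slash keeps sibling folders such as
     * /content/dam/product-assets-archive from matching by accident.
     */
    static boolean needsPngRendition(String payloadPath) {
        return payloadPath != null && payloadPath.startsWith(PRODUCT_ROOT + "/");
    }

    public static void main(String[] args) {
        System.out.println(needsPngRendition("/content/dam/product-assets/shoes/a.png"));
        System.out.println(needsPngRendition("/content/dam/brand/logo.png"));
    }
}
```

The check is a single string prefix comparison per asset, so it should be negligible next to the rendition generation itself.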

 

To reprocess the existing assets, I would create a Sling scheduler (not a Sling job; if a scheduled run gets missed it is not a big deal, it will run the next day) and let it run at night. Have it process up to X assets per run and then shut down. It would walk the tree of assets that need to be reprocessed (i.e. /content/dam/product-assets) and keep up to 100 (ballpark guess) workflows running at a time. You can check the status of the workflows by using https://developer.adobe.com/experience-manager/reference-materials/6-5/javadoc/com/day/cq/workflow/W... and using https://developer.adobe.com/experience-manager/reference-materials/6-5/javadoc/com/day/cq/workflow/e...

Once the workflow is done processing, set a custom metadata property like dam:productAssetsUpdated = {Boolean}true so that when the scheduler completes at the end of its cycle, it knows what was processed. 

 

This means that the scheduler would wake up at night (avoiding peak authoring time), process 100 assets at a time, and monitor the workflows, pushing more through as running ones complete. Once it hits the X limit, it would stop. The process would begin again the next night and skip the assets it already processed (because the metadata property would be present). This lets the assets get processed during off-hours and lets you check the next day what was processed. Once they are all processed, you can delete the scheduler.
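
The throttling loop described above could be sketched roughly as follows, in plain Java without AEM dependencies. `WorkflowGateway` is a hypothetical seam standing in for the workflow engine, the `processed` set stands in for the metadata property, and `InstantGateway` is a test double whose workflows finish immediately. A real scheduler would also sleep between polls and time out stuck workflows.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class NightlyReprocessor {

    /** Hypothetical seam over the workflow engine; not an AEM interface. */
    interface WorkflowGateway {
        String start(String assetPath);        // returns a workflow id
        boolean isFinished(String workflowId);
    }

    private final WorkflowGateway gateway;
    private final Set<String> processed;       // stands in for the metadata flag
    private final int maxInFlight;              // e.g. 100
    private final int maxPerRun;                // the "X" limit per night

    NightlyReprocessor(WorkflowGateway gateway, Set<String> processed,
                       int maxInFlight, int maxPerRun) {
        this.gateway = gateway;
        this.processed = processed;
        this.maxInFlight = maxInFlight;
        this.maxPerRun = maxPerRun;
    }

    /** Runs one nightly cycle and returns how many workflows were started. */
    int runOnce(List<String> assetPaths) {
        Deque<String> pending = new ArrayDeque<>();
        for (String path : assetPaths) {
            if (!processed.contains(path)) {    // skip assets flagged earlier
                pending.add(path);
            }
        }
        Map<String, String> inFlight = new LinkedHashMap<>(); // id -> path
        int started = 0;
        while ((started < maxPerRun && !pending.isEmpty()) || !inFlight.isEmpty()) {
            // Top up to the concurrency limit without exceeding the run limit.
            while (inFlight.size() < maxInFlight && started < maxPerRun
                    && !pending.isEmpty()) {
                String path = pending.poll();
                inFlight.put(gateway.start(path), path);
                started++;
            }
            // Flag finished workflows; a real scheduler would sleep here.
            Iterator<Map.Entry<String, String>> it = inFlight.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<String, String> entry = it.next();
                if (gateway.isFinished(entry.getKey())) {
                    processed.add(entry.getValue());
                    it.remove();
                }
            }
        }
        return started;
    }

    public static void main(String[] args) {
        List<String> assets = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            assets.add("/content/dam/product-assets/asset-" + i + ".png");
        }
        NightlyReprocessor job =
                new NightlyReprocessor(new InstantGateway(), new HashSet<>(), 100, 150);
        System.out.println("night 1 started: " + job.runOnce(assets));
        System.out.println("night 2 started: " + job.runOnce(assets));
    }
}

/** Test double: every workflow finishes immediately; tracks peak concurrency. */
final class InstantGateway implements NightlyReprocessor.WorkflowGateway {
    private final Set<String> active = new HashSet<>();
    private int nextId = 0;
    int peak = 0;

    @Override
    public String start(String assetPath) {
        String id = "wf-" + (nextId++);
        active.add(id);
        peak = Math.max(peak, active.size());
        return id;
    }

    @Override
    public boolean isFinished(String workflowId) {
        active.remove(workflowId);
        return true;
    }
}
```

Keeping the workflow engine behind a small interface like this makes the throttling logic testable outside AEM; the two limits correspond to the "100 at a time" and "up to X per run" figures above.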

 

Adobe Champion

Thank you @pat-lego  and @arunpatidar .

As you mentioned, @pat-lego , we've already implemented a scheduler that handles the reprocessing of assets whose renditions were missed but still got published, potentially blocking the replication queue. This monitoring scheduler identifies such assets and reprocesses them accordingly.

My main question, however, was about the performance comparison between Solution 1 and Solution 2. Both ensure that only one workflow runs, but the key difference lies in how the condition is applied:

  • Solution 1 includes a conditional check within the workflow itself for specific paths.

  • Solution 2 uses a launcher that triggers the workflow based on path conditions.

I was curious to understand which of the two would offer better performance and optimization.

Employee

Solution 2 is going to be more expensive, because there is a cost to creating a workflow.

A workflow is a Sling job that needs to be created and that then triggers the creation of a workflow instance within the Oak repository. This in itself is a lot more expensive than doing a conditional check.

Level 1

Hey @pat-lego and @arunpatidar, thank you for the detailed responses.
Our DAM Update WF is highly customized and includes a few bespoke rendition creation steps.
Question: Do you think adding path-based conditions within these custom steps could cause the WF to run longer than usual and impact author performance?

E.g. we have a daily feed that ingests an average of 10K delta assets. The customized DAM Update WF runs on all these deltas. With path-based conditions, will it take more time for the WF to finish processing these assets?
Note: We additionally have a Move WF that moves these assets from one folder path to another after the DAM Update WF finishes.

Adobe Champion

Just an additional idea, not sure if it is applicable here:

 

For such asset volumes and considering potential performance issues, I wonder whether offloading the renditions process would make more sense via custom A.IO actions and asset processing profiles. See https://experienceleague.adobe.com/en/docs/experience-manager-learn/cloud-service/asset-compute/depl... 

 

It does add an extra layer in the process, but I believe you can apply the processing profiles on a per-folder basis and fine-tune when the particular renditions are required, avoiding redundant processing.