
SOLVED

Bulk Import Assets in AEM DAM


Level 4

Hi

We have a requirement to upload 1.5 million assets (images, multimedia, PPT, DOC, etc.) into the AEM DAM, along with their metadata.

1. What is the best approach for performing the bulk import?

2. Should we set up our environment with MongoMK (with MongoDB as the database), or should we use a different repository?

Thanks

~S

1 Accepted Solution


Correct answer by
Employee

Hi Samer,

The decision to use MongoMK should be based on your scalability/HA requirements [0].

By default, you would use TarMK, and the assets would be included in the repository files. However, for large DAM installations we recommend an external data store, which can be used with either TarMK or MongoMK [1].

Either way, you should batch your uploads to reduce the strain on the system, and consider running them when user activity is at its lowest. Another option is to stop the DAM Update Asset workflow during the uploads and have a script that is run after the uploads. Have a look at the comments section of [2].

You can always write your own Java application that uploads content to AEM by calling the Sling POST servlet (SlingPostServlet), passing in the correct parameters (location, properties, etc.).
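As a rough illustration of the Sling POST approach, the sketch below builds (but does not send) a request that writes metadata properties onto an asset that already exists. It is written in Python for brevity; the same request could be issued from a Java client. The host, credentials, asset path and property values are placeholders, and `build_metadata_request` is a hypothetical helper, not an AEM API. It assumes the default Sling POST behaviour of treating form fields as properties of the addressed node, and that the asset keeps its metadata under `jcr:content/metadata`.

```python
import base64
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_metadata_request(host, asset_path, properties, user, password):
    """Build a Sling POST request that sets metadata on one DAM asset.

    Hypothetical helper: host, asset_path and credentials are placeholders.
    """
    # DAM assets store their metadata on the jcr:content/metadata child node.
    url = host.rstrip("/") + quote(asset_path) + "/jcr:content/metadata"
    # Each form field becomes a property on the addressed node.
    body = urlencode(properties).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return Request(
        url,
        data=body,
        headers={
            "Authorization": "Basic " + token,
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


req = build_metadata_request(
    "http://localhost:4502",
    "/content/dam/import/photo 1.jpg",
    {"dc:title": "Photo 1", "dc:description": "Imported in bulk"},
    "admin",
    "admin",
)
# urllib.request.urlopen(req) would send it; omitted so the sketch has no side effects.
```

Keeping request construction separate from sending makes it easy to dry-run a whole import and inspect the target paths before touching the server.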

Whatever you do, you must run your full upload on a test system first to determine the best approach (e.g. batch size) and any optimizations that should be configured, such as [3].

Regards,

Opkar

[0] https://docs.adobe.com/docs/en/aem/6-1/deploy/recommended-deploys.html

[1] https://docs.adobe.com/docs/en/aem/6-1/deploy/platform/data-store-config.html#Data Store Configurations

[2] https://edivad.wordpress.com/2013/01/07/installing-big-packages-in-cq5crx/

[3] http://cq-ops.tumblr.com/post/122108616399/the-importance-of-javaiotmpdir-in-dam-asset


6 Replies


Level 10

For this many nodes, it would be wise to use MongoDB. Also, importing this many nodes into the DAM may impact performance. I will get support to look at this thread.


Level 4

OK.

Does AEM 6.1 provide any bulk import utility or service out of the box (OTB)?

Thanks

~S


Level 10

Use a file data store. Plan to use offloading, or ingest the assets in batches and make sure each batch starts only after the previous one has completed.

Additionally, plan to disable online compaction until the import is complete.
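The "one batch at a time" idea can be sketched as follows. This is a minimal Python outline, not an AEM utility: `upload_batch` and `wait_until_idle` are hypothetical callables the caller supplies (for example, one that posts the files and one that polls until asset processing has finished), and the batch size of 2 is only for the demo.

```python
from itertools import islice


def batches(items, size):
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def run_in_batches(items, size, upload_batch, wait_until_idle):
    """Upload batch by batch; never start a batch before the previous one settled.

    `upload_batch` and `wait_until_idle` are caller-supplied, hypothetical
    callables (for example: POST the files, then poll until processing ends).
    """
    completed = 0
    for batch in batches(items, size):
        upload_batch(batch)
        wait_until_idle()  # block here so batches never overlap
        completed += len(batch)
    return completed


# Demo with stand-in callables that only record the call order.
log = []
total = run_in_batches(
    [f"asset-{i}.jpg" for i in range(5)],
    2,
    upload_batch=lambda b: log.append(("upload", len(b))),
    wait_until_idle=lambda: log.append(("idle",)),
)
```

The right batch size is something to find by experiment on a test system, as suggested in the accepted answer.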


Level 9

In my opinion, you should have a separate data store for this many digital assets. Maintaining over a million assets in a single repository isn't going to be easy. The cloud could be one option.

For data store config: https://docs.adobe.com/docs/en/aem/6-1/deploy/platform/data-store-config.html

Jitendra



Level 4

1. Is there a recommended batch size? Roughly what size should each batch be?

2. According to the articles on multi-file upload, we first need to create an XML file listing all the asset names and then pass it to the OSGi service. That is extra development effort, because the client keeps all the assets in other hot folders or repositories. Instead, is there a way to create a component that picks up the source folder name and destination folder name from a dialog or a properties file and passes them to a servlet deployed on AEM, which then calls the other servlet that uploads the assets?

Any suggestions?
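For illustration only, the "source folder in, destination folder out" idea from point 2 could start from a generated list rather than a hand-written XML file. The Python sketch below walks a local source folder and maps every file to a target path under a DAM folder. `build_manifest` and the `/content/dam/import` root are hypothetical names chosen for the example, and the upload step itself is left out.

```python
import os
import posixpath
import tempfile


def build_manifest(source_dir, dam_root):
    """Return sorted (local_path, dam_path) pairs for every file under source_dir."""
    pairs = []
    for folder, _subdirs, files in os.walk(source_dir):
        relative = os.path.relpath(folder, source_dir)
        for name in files:
            # Mirror the local sub-folder structure below the DAM root.
            parts = [] if relative == os.curdir else relative.split(os.sep)
            dam_path = posixpath.join(dam_root, *parts, name)
            pairs.append((os.path.join(folder, name), dam_path))
    return sorted(pairs, key=lambda pair: pair[1])


# Demo on a throwaway directory tree.
with tempfile.TemporaryDirectory() as source:
    os.makedirs(os.path.join(source, "brochures"))
    for rel in ("logo.png", os.path.join("brochures", "q1.pdf")):
        with open(os.path.join(source, rel), "wb") as handle:
            handle.write(b"demo")
    manifest = build_manifest(source, "/content/dam/import")
    targets = [dam_path for _local, dam_path in manifest]
```

The resulting pairs can then be fed to whatever upload mechanism is chosen, one batch at a time.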
