SOLVED

Bulk Import Assets in AEM DAM

sameryadav
Level 4

Hi

We have a requirement to upload 1.5 million assets (images, multimedia, PPT, DOC, etc.) into the AEM DAM, including their metadata.

1. What is the best approach for performing the bulk import?

2. Should we set up our environment with MongoMK (backed by MongoDB), or should we use another repository?

Thanks

~S

1 Accepted Solution
Opkar_Gill
Correct answer by
Employee

Hi Samer,

The decision to use MongoMK should be based on your scalability/HA requirements [0].

By default, you would use TarMK, and the assets would be stored in the repository files. However, for large DAM installations we recommend using an external data store, which can be done with either TarMK or MongoMK [1].

Either way, you should batch your uploads to reduce the strain on the system, and consider running them when user activity is at its lowest. Another option is to stop the DAM Update Asset workflow during the uploads and have a script that runs after the uploads. Have a look at the comments section of [2].
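As a rough illustration of the batching idea, here is a minimal Python sketch. It only shows the control flow: split the asset list into fixed-size batches and finish one batch before starting the next. The names `make_batches`, `run_in_batches`, and `upload_one`, and the pause between batches, are hypothetical and not part of AEM; the right batch size is something you would have to find by testing.

```python
import time
from typing import Callable, Iterator, List, Sequence


def make_batches(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def run_in_batches(
    items: Sequence[str],
    batch_size: int,
    upload_one: Callable[[str], None],  # hypothetical: whatever uploads one asset
    pause_seconds: float = 0.0,         # assumed breathing room between batches
) -> List[str]:
    """Upload batch by batch; return the items whose upload raised an error."""
    failed: List[str] = []
    for batch in make_batches(items, batch_size):
        for item in batch:
            try:
                upload_one(item)
            except Exception:
                # Broad catch is for the sketch only: remember the item for a retry.
                failed.append(item)
        # The next batch starts only after this one has been fully processed.
        time.sleep(pause_seconds)
    return failed
```

Collecting failures instead of aborting keeps a long import moving; the failed items can be retried as a final, smaller batch.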

You can always write your own Java application that uploads content to AEM by calling the Sling POST servlet and passing in the correct parameters (location, properties, etc.).
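To give an idea of what such a request looks like, here is a minimal sketch of setting metadata properties through the Sling POST servlet. It is written in Python for brevity; a Java client would send the same request. The host, credentials, asset path, and property values are assumptions for illustration, and the function `build_metadata_request` is hypothetical. It relies on the asset's metadata living under `jcr:content/metadata` and on the servlet writing each form field as a property of the node it is posted to.

```python
import base64
from typing import Mapping
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_metadata_request(
    base_url: str,                  # assumed author instance, e.g. http://localhost:4502
    asset_path: str,                # assumed DAM path of an already uploaded asset
    properties: Mapping[str, str],  # metadata to write, e.g. {"dc:title": "..."}
    user: str,
    password: str,
) -> Request:
    """Build (but do not send) a POST that sets properties on the metadata node."""
    metadata_path = asset_path.rstrip("/") + "/jcr:content/metadata"
    # Keep "/" and ":" readable; escape spaces and other unsafe characters.
    url = base_url.rstrip("/") + quote(metadata_path, safe="/:")
    body = urlencode(properties).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")

    request = Request(url, data=body, method="POST")
    request.add_header("Authorization", "Basic " + token)
    request.add_header(
        "Content-Type", "application/x-www-form-urlencoded; charset=utf-8"
    )
    return request
```

Passing the returned object to `urllib.request.urlopen` would send it. Building the request separately from sending it makes it easy to inspect and log what will be written before touching a real instance.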

Whatever you do, you must test your full upload on a test system first to determine the best approach (e.g. batch size) and any optimizations that should be configured, such as [3].

Regards,

Opkar

[0] https://docs.adobe.com/docs/en/aem/6-1/deploy/recommended-deploys.html

[1] https://docs.adobe.com/docs/en/aem/6-1/deploy/platform/data-store-config.html#Data Store Configurations

[2] https://edivad.wordpress.com/2013/01/07/installing-big-packages-in-cq5crx/

[3] http://cq-ops.tumblr.com/post/122108616399/the-importance-of-javaiotmpdir-in-dam-asset

6 Replies
smacdonald2008
Level 10

For this many nodes, it would be wise to use MongoDB. Also, importing this many nodes into the DAM may impact performance. I will ask support to look at this thread.

sameryadav
Level 4

OK.

Does AEM 6.1 provide any bulk import utility or service out of the box (OTB)?

Thanks

~S

Sham_HC
Level 10

Use a file data store. Plan to use offloading, or ingest the assets in batches and make sure the next batch starts only after the previous batch is complete.

Additionally, plan to disable online compaction until the import is complete.

Jitendra_S_Toma
Level 9

In my opinion, we should have a separate data store for this many digital assets. Maintaining over a million assets in a single repository isn't going to be easy. The cloud could be one option.

For data store config: https://docs.adobe.com/docs/en/aem/6-1/deploy/platform/data-store-config.html

Jitendra

sameryadav
Level 4

1. Is there a recommended batch size? Roughly what should it be?

2. According to the articles on uploading multiple files, it seems we first need to create an XML file listing all the asset names and then pass it to the OSGi service. This is a little extra effort at the dev level, because the client has all the assets in other hot folders or repositories. Instead, is there an option to create a component that picks up the source folder name and destination folder name via a dialog or a properties file and passes them to a servlet deployed on AEM, which in turn calls another servlet that uploads the assets?

Any suggestions?
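To make the second question concrete: the list of assets could be derived from the source folder itself rather than from a hand-written XML file. Below is a minimal Python sketch of that idea, assuming the source is a local or mounted folder and that the folder structure should be mirrored under a destination DAM folder. The function `plan_uploads` and the example paths are hypothetical; the sketch only builds the list of (local file, target DAM path) pairs and does not upload anything.

```python
from pathlib import Path, PurePosixPath
from typing import List, Tuple


def plan_uploads(source_root: str, dam_root: str) -> List[Tuple[Path, str]]:
    """Map every file under `source_root` to a target path under `dam_root`.

    The relative folder structure of the source is preserved in the target.
    """
    root = Path(source_root)
    plan: List[Tuple[Path, str]] = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root)
            # DAM paths always use forward slashes, whatever the local OS uses.
            target = PurePosixPath(dam_root).joinpath(*relative.parts)
            plan.append((path, str(target)))
    return plan


if __name__ == "__main__":
    # Hypothetical source and destination folders.
    for local_file, dam_path in plan_uploads("/data/export", "/content/dam/import"):
        print(local_file, "->", dam_path)
```

The source and destination folder names could then come from a properties file or a dialog, as described above, and the resulting list could be fed to whatever upload mechanism is chosen.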
