Issues: Bulk Image Upload to AEM DAM (>8000 Assets)
All,
I am facing issues uploading assets to AEM via a bundle. I have close to 9000 assets, and I notice issues such as:
1. Free disk space decreases by around 9-10 GB during the upload, and after tar optimization I see around 5 GB released.
Question: What is the potential cause of this increase in disk usage?
Approach taken: I call save after every 20-22 assets, and I have also tried every 10-15. After creating each asset and setting its metadata values and tags, the code checks whether the workflow for the asset is complete and whether the 48x48 rendition has been generated. If the workflow is still running or the rendition has not been generated, I call Thread.sleep and wait for it to complete. This wait for workflow and rendition completion is done for a batch of 15 assets. Once it completes, save is invoked when the image counter reaches 20 or 15.
I also used asset.setBatch, but in vain.
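For reference, a minimal sketch of the batch approach described above, with a bounded wait so that one stuck asset cannot block the whole upload. `AssetStore` and `BatchUploadSketch` are hypothetical names used only for illustration; they stand in for the real AEM calls (asset creation, the rendition check, and the session save) and are not AEM interfaces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for the real AEM calls; not an AEM interface.
interface AssetStore {
    void create(String path);          // e.g. wraps the createAsset call
    boolean isProcessed(String path);  // e.g. true once the 48x48 rendition exists
    void save();                       // e.g. wraps the session save
}

final class BatchUploadSketch {

    /** Polls until the check passes or the timeout elapses; returns whether it passed. */
    static boolean waitUntil(Predicate<String> check, String path, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!check.test(path)) {
            if (System.nanoTime() >= deadline) {
                return false; // give up on this asset instead of waiting forever
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }

    /** Creates assets batch by batch, waits for each, saves once per batch. */
    static List<String> upload(List<String> paths, AssetStore store, int batchSize,
                               long timeoutMs, long pollMs) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> unfinished = new ArrayList<>();
        for (int start = 0; start < paths.size(); start += batchSize) {
            int end = Math.min(start + batchSize, paths.size());
            List<String> batch = paths.subList(start, end);
            for (String path : batch) {
                store.create(path);
            }
            for (String path : batch) {
                if (!waitUntil(store::isProcessed, path, timeoutMs, pollMs)) {
                    unfinished.add(path); // record for a later retry or manual check
                }
            }
            store.save(); // one save per batch, as in the approach above
        }
        return unfinished;
    }
}
```

The returned list holds the paths that did not finish within the timeout, so they can be logged or retried separately rather than holding up the remaining batches.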
Question: Does AEM cache the InputStream in any way?
Approach taken: I open the InputStream, upload the asset using the assetManager.createAsset method, and close the stream once the asset has been streamed.
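On the caller's side, the open/upload/close sequence can be sketched with try-with-resources, which closes the stream even when the upload throws. This says nothing about what AEM does with the data internally. `StreamSource`, `AssetUploader`, and `StreamUploadSketch` are hypothetical names for illustration; `AssetUploader` stands in for a wrapper around the createAsset call.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical: opens a fresh stream for one asset (file, URL, etc.).
interface StreamSource {
    InputStream open() throws IOException;
}

// Hypothetical stand-in for a wrapper around the real upload call.
interface AssetUploader {
    void upload(String path, InputStream data) throws IOException;
}

final class StreamUploadSketch {

    /** Uploads one asset; the stream is closed whether or not the upload succeeds. */
    static boolean uploadOne(String path, StreamSource source, AssetUploader uploader) {
        try (InputStream in = source.open()) {
            uploader.upload(path, in);
            return true;
        } catch (IOException e) {
            // The stream has already been closed by try-with-resources here.
            return false;
        }
    }
}
```

Opening one stream per asset and scoping it to a single upload keeps at most one stream open at a time on the uploading side.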
Note: Waiting for workflow completion and rendition generation for a set of assets reduces the disk space consumed, but I noticed that it still consumed 7.5 GB.
2. Batch wait-and-save approach: I noticed that the disk space increase could be reduced by a few GB by waiting for workflow completion and rendition generation for a set of assets (I tried batches of 15).
With this approach, however, I noticed the following issues:
Issue 1: I see exceptions at times even though the workflow status is complete, so the workflow status does not help me identify which assets were not fully processed.
- com.day.cq.dam.commons.handler.StandardImageHandler failed to extract image using Layer will try the fallback java.util.ConcurrentModificationException
- Image Read Exception : Invalid marker found in entropy data
- "failure to create asset rendition"
Issue 2: The instance stops processing when it reaches a limit and logs caching statements and a Tar journal thread lock message. The instance becomes unresponsive.
Message Observed in log:
Issue 3: Is there a way to capture the errors from Issue 1 and skip processing those assets? I had hoped the workflow would fail in these cases, but that does not appear to happen.
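One possible way to catch unreadable images such as those in Issue 1 is to try decoding each file with the JDK's `javax.imageio.ImageIO` before uploading it, and to skip or log the ones that fail. This is only a sketch: the JDK decoder is not the same code as AEM's image handler, so passing this check does not guarantee that AEM will process the file, and a full decode can be expensive for large images. `ImagePrecheckSketch` is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

final class ImagePrecheckSketch {

    /** Returns true only if the JDK's ImageIO can decode the bytes as an image. */
    static boolean isDecodable(byte[] content) {
        try {
            // ImageIO.read returns null when no registered reader recognizes the data.
            return ImageIO.read(new ByteArrayInputStream(content)) != null;
        } catch (IOException | RuntimeException e) {
            // Unreadable or corrupt data, e.g. a damaged JPEG.
            return false;
        }
    }
}
```

Files that fail the check can be collected in a list and reported after the upload, instead of relying on the workflow status to reveal them.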
3. I would like to disable workflow processing before uploading the assets and re-enable it after the upload is complete. I hear many mention that this helps.
Question: Does this still create the necessary renditions and versions?
Question: Which workflow or workflows am I supposed to stop temporarily?
I feel this would help avoid many issues when uploading approximately 9000 assets.
Regards,