Logic behind identifying Duplicate Assets | Community
mozzvinod
July 2, 2019

Logic behind identifying Duplicate Assets


Hi All,

On the basis of which parameters does AEM identify whether an asset is a duplicate or not? I have been able to analyse it as far as the steps below, but this does not give me the exact parameters: is it the asset's size, name, or metadata?

How to enable duplicate check:

Go to the Adobe Experience Manager Web Console Configuration page at the following URL:

http://<server>:<port>/system/console/configMgr

  • Edit the configuration for the servlet Day CQ DAM Create Asset.
  • Select the detect duplicate option, and click/tap Save. The Detect Duplicate feature is now enabled in AEM Assets.

CreateAssetServlet will be invoked on save.

How SHA-1 is calculated:

  • Get the asset's original rendition using the line of code below

Rendition original = asset.getOriginal();

Note: The original rendition will be unique for each asset.

  • Get the input stream of the rendition

is = original.getStream();

  • Get the sha1 value by passing the input stream (is) to the shaHex method of DigestUtils.

sha1 = DigestUtils.shaHex(is);
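
If the three steps above reflect what AEM does, the only input to the hash is the byte content of the original rendition; the asset's name, path, and metadata are never read. The following is a minimal sketch of that idea using only the JDK's MessageDigest in place of DigestUtils.shaHex. The class name Sha1Sketch and the sample content are hypothetical and exist only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha1Sketch {

    /** Returns the lowercase hex SHA-1 of every byte read from the stream. */
    public static String sha1Hex(InputStream in) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        byte[] buffer = new byte[8192];
        int read;
        // Feed the digest with the raw bytes only; nothing else contributes.
        while ((read = in.read(buffer)) != -1) {
            md.update(buffer, 0, read);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in for the bytes of an uploaded file.
        byte[] content = "abc".getBytes(StandardCharsets.UTF_8);

        // Two "uploads" with identical bytes, whatever their file names are.
        String first = sha1Hex(new ByteArrayInputStream(content));
        String second = sha1Hex(new ByteArrayInputStream(content.clone()));

        System.out.println(first);                // a9993e364706816aba3e25717850c26c9cd0d89d
        System.out.println(first.equals(second)); // true
    }
}
```

Under that assumption, two files with identical bytes produce the same value even if their names or metadata differ, and changing a single byte produces a different value.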

How duplicate assets are identified:

  • Run a query on DAM assets, passing the calculated sha1, to get the list of duplicate assets.

String queryString = "//element(*, dam:Asset)[(jcr:content/metadata/@dam:sha1 = '" + sha1 + "')]";

  • Iterate through the list returned by the above query and check whether the path of any asset equals the path of the asset we are trying to upload. If so, remove that asset from the list.

if (((String) ((List) duplicateAssets).get(i)).equals(asset.getPath())) {
    ((List) duplicateAssets).remove(i);
    break;
}
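
The two steps above can be sketched without a repository. This is only an illustration under stated assumptions: the class DuplicateFilterSketch, its method names, and the sample paths are hypothetical, and the list of matching paths stands in for whatever the JCR query would return.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DuplicateFilterSketch {

    /** Builds the XPath query string quoted in the post for a given sha1 value. */
    public static String buildQuery(String sha1) {
        return "//element(*, dam:Asset)[(jcr:content/metadata/@dam:sha1 = '" + sha1 + "')]";
    }

    /**
     * Returns the matching paths minus the asset being uploaded, so an asset
     * is never reported as a duplicate of itself.
     */
    public static List<String> excludeSelf(List<String> matchingPaths, String uploadedPath) {
        List<String> duplicates = new ArrayList<>(matchingPaths);
        // List.remove(Object) drops the first equal entry, like the loop with break.
        duplicates.remove(uploadedPath);
        return duplicates;
    }

    public static void main(String[] args) {
        // Hypothetical query result: every asset whose dam:sha1 matches.
        List<String> matches = Arrays.asList(
                "/content/dam/a/logo.png",
                "/content/dam/b/logo-copy.png");

        System.out.println(buildQuery("a9993e364706816aba3e25717850c26c9cd0d89d"));
        System.out.println(excludeSelf(matches, "/content/dam/a/logo.png"));
    }
}
```

In this sketch the query matches on the dam:sha1 property alone, so whatever remains in the list after the uploaded asset's own path is removed is treated as a duplicate.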

Your help in this regard would be highly appreciated.

Regards,

Vinod


1 reply

mozzvinod (Author)
July 4, 2019

Thanks for this.

But this does not answer my question about how the dam:sha1 value is derived, and which parameters are used to create it, when I upload a file into AEM DAM.

Regards,

Vinod

mozzvinod (Author)
July 11, 2019

You have identified the logic behind the functionality (I cannot say if correct or not). But I don't get your question, sorry.

Jörg


Hi Jorg,

My question is very specific.

1. How is dam:sha1 created?

2. Which parameters are used to create dam:sha1?

This will help in determining the parameters used for deciding whether an asset is a duplicate or not.

Regards,

Vinod