Getting rid of duplicate assets | Community
Adilos-Cantuerk
Community Advisor
September 6, 2022
Question

Getting rid of duplicate assets


Hello everybody,

I am looking for a way to get rid of all the duplicate assets in our DAM.
Can somebody point me to a guide on creating a workflow that does the following?

Step 1: Create a list of every asset's "jcr:content/metadata/dam:sha1" value
Step 2: Compare the values and find the duplicates
Step 3: Create a list of all the duplicates and check whether each one is in use (on a site or in an InDesign file)
Step 4: If a duplicate is not in use, delete it
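
The steps above can be sketched roughly as follows. This is only an illustration of the grouping logic in Python, not an AEM workflow: it assumes the (path, sha1) pairs have already been read from the repository somehow, and `is_in_use` is a hypothetical callback standing in for whatever reference check is available.

```python
from collections import defaultdict
from typing import Callable, Iterable


def find_duplicate_groups(assets: Iterable[tuple[str, str]]) -> dict[str, list[str]]:
    """Map each SHA-1 that occurs more than once to the sorted paths sharing it."""
    by_hash: dict[str, list[str]] = defaultdict(list)
    for path, sha1 in assets:          # Step 1: collect every dam:sha1 value
        by_hash[sha1].append(path)
    # Steps 2 and 3: keep only the hashes shared by two or more assets
    return {h: sorted(paths) for h, paths in by_hash.items() if len(paths) > 1}


def deletion_candidates(
    groups: dict[str, list[str]],
    is_in_use: Callable[[str], bool],  # hypothetical reference check
) -> list[str]:
    """Keep the first path of each group; propose the others only if unused."""
    candidates = []
    for paths in groups.values():
        for path in paths[1:]:
            if not is_in_use(path):    # Step 4: only unused copies are proposed
                candidates.append(path)
    return candidates
```

The sketch only returns a list of candidates and deletes nothing, so the result can be reviewed before any cleanup is run.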

If a duplicate is in use, the reference resolver should be able to fix the references, right?
Call one file A and the other B; they are exactly the same.
One is in dam/arms and the other in dam/illustration.
A is used in 2 files and B in 3. Could the reference resolver re-establish the connections to A if I delete B?
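
To make the A/B case concrete, here is a small Python sketch of what a "repointing plan" could look like. It makes no claim about what the AEM reference resolver actually does. It assumes a map from each duplicate's path to the pages or files that reference it is already available, and the function name and data shape are made up for illustration.

```python
def plan_repointing(
    references: dict[str, list[str]],
) -> tuple[str, list[tuple[str, str, str]]]:
    """Pick the most-referenced copy as keeper and list the rewrites needed.

    `references` maps each duplicate's path to the items that reference it.
    Returns (keeper_path, [(referrer, old_path, new_path), ...]).
    """
    # Sort the paths first so that ties are broken deterministically:
    # max() returns the first maximal element it encounters.
    keeper = max(sorted(references), key=lambda path: len(references[path]))
    rewrites = [
        (referrer, old_path, keeper)
        for old_path in sorted(references)
        if old_path != keeper
        for referrer in references[old_path]
    ]
    return keeper, rewrites
```

With A referenced by 2 files and B by 3, this sketch would keep B and list two rewrites for A's referrers, since keeping the most-referenced copy means the fewest references to change. Keeping A instead would simply mean three rewrites.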


2 replies

Adobe Employee
September 8, 2022

Check the duplicate detection documentation: https://experienceleague.adobe.com/docs/experience-manager-64/assets/managing/duplicate-detection.html?lang=en

Hope this helps and gives some direction.

Thanks

Anika

Adilos-Cantuerk
Community Advisor
September 8, 2022

The duplicate detection is nice, but it works like googly eyes on a rock used as an earthquake detector.
It makes visible what is going on, but it only gives a warning after the fact and does not help to get rid of the mess. It even adds a bunch of notifications that do not help with the cleanup and just clog up the notification list.
For example:
A user imports 11,000 files into a fresh folder. AEM imports them.
THEN duplicate detection, after the reprocess, sends 700 duplicate-detection notifications to the admin.
Each notification informs me that 2 to 40 duplicates were in that upload (5,800 in total), and now I need to delete the duplicates and the notifications and manually clean up the mess.
Currently I make a metadata export of the sha1 values and then give the duplicates with the later upload date a unique tag.
Later I search for that tag and delete all those files manually.
There has to be a quicker, automated way.
Best of all would be if duplicate detection deleted duplicates automatically.
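
The manual export step described above could be scripted along these lines. This is a minimal Python sketch that works on the exported CSV rather than on AEM itself. The column names `assetPath`, `dam:sha1` and `jcr:created` are assumptions, as is the ISO date format (for example `2022-09-01T10:00:00`, all with the same timezone handling), so adjust them to whatever the actual export contains.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical column names; adjust to the headers in your actual export.
PATH_COL, SHA1_COL, CREATED_COL = "assetPath", "dam:sha1", "jcr:created"


def later_uploads(csv_text: str) -> list[str]:
    """Return the paths of all duplicates except the earliest copy per SHA-1."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row[SHA1_COL]:  # skip assets that have no checksum
            created = datetime.fromisoformat(row[CREATED_COL])
            groups[row[SHA1_COL]].append((created, row[PATH_COL]))
    to_tag = []
    for copies in groups.values():
        copies.sort()  # earliest upload first; ties are broken by path
        to_tag.extend(path for _, path in copies[1:])
    return sorted(to_tag)
```

The returned paths correspond to the files that currently get the unique tag by hand. Feeding that list into a tagging or deletion step is left out on purpose, so the list can be checked first.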

Level 2
January 22, 2025

Still looking for this. Any update?