How can I query to see which DAM assets in a given /content/dam path are being used most (to least) in a given /content/site path?
You can generate a report of published assets, which should give you a list of used assets:
http://localhost:4502/mnt/overlay/dam/gui/content/reports/reportlist.html
Thanks for the suggestion Hemant. I posed the question in more detail to Adobe Support:
Our DAM has grown large and we suspect many assets are unused and could be deleted. We're using Classic UI and can select Tools, References... in the DAM Admin to see the references to individual assets, but we'd like to check the whole DAM and find assets with 0 references, or 1, or 2, or an arbitrary number of references.
Can you assist with a method or query for reporting on asset use?
I have tried using the following:
http://www.wemblog.com/2012/12/how-to-remove-non-referenced-node-from.html
https://github.com/hashimkhan786/aem-groovy-scripts/blob/master/findUnusedAssets.groovy
but each incorrectly reports assets as unreferenced that are in fact referenced.
I would like to use a similar technique as the UI (Tools / References...) in Classic UI / DAM Admin, but for the whole DAM.
I'm not a programmer so a complete example / query would be appreciated.
I received the following answer from Adobe Support:
Please note that there is no out-of-the-box feature for this task, so it has to be implemented as a custom piece of code. The challenge is that we can only search for the asset references on a page; it is not possible to search within an entire DAM to check whether an asset is referenced by any page. The reason is that there is no property on an asset that indicates references from a page or from any other asset. However, the assets referenced on a page can be determined from a property of the component nodes, and this property can be queried to find the references. But as you can imagine, such a query will only show the asset references in one page, not across the entire DAM.
So to find the asset references in DAM a piece of code can be written to do this.
Here is some sample pseudocode:
- set the search path to DAM root path
- iterate through each node using "NodeIterator"
- search property to find the references using "ReferenceSearch"
- Store results in java "Map" object
- return nodes/assets not referenced anywhere
- Recursively iterate through all the assets from root path
Please note that this task is highly technical and would require Java coding skills, as there is no single query that would give us the results we are looking for.
So a better option for you might be engaging Adobe Consulting Services.
I don't code in Java, but Adobe Support's suggestions got me thinking, and I came up with a roundabout solution using querybuilder, the built-in references.json servlet, and command line tools. It produces two files: one listing unreferenced assets, and another listing assets together with their references:
# get list of assets, where /content/dam/site/folder is the folder you want to query
curl -s -u "admin:admin" "http://localhost:4502/bin/querybuilder.json?path=/content/dam/site/folder&type=dam:Asset&p.limit=-1" > assets-in-path.json
# clean up assets-in-path.json with jq, a command line JSON parser. Install jq via brew install jq (or similar).
# jq is the Swiss army knife of JSON parsing; -r prints raw strings, so no sed is needed to strip the surrounding quotes.
jq -r '.hits[].path' assets-in-path.json > clean-assets-in-path.txt
# create php file to urlencode asset paths
sed 's/^/echo urlencode("/g' clean-assets-in-path.txt |sed 's/$/"),PHP_EOL;/g' > urlencode-assets-in-path.php
# add opening and closing php script tags to top and bottom of the urlencode-assets-in-path.php file, at top: <?php
# and at bottom: ?>
# run the php script
php urlencode-assets-in-path.php > urlencoded-assets-in-path.txt
# add the _charset_=utf-8 query parameter (to the curl commands that the script below will construct), as the asset paths may contain percent-encoded UTF-8 characters.
# this query parameter is necessary to get accurate results back from AEM: without it, an asset may be reported as having no references when it actually has some
sed 's/$/\&_charset_=utf-8/g' urlencoded-assets-in-path.txt > dam_assets.txt
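If PHP isn't available, the urlencode and _charset_ steps can probably be done with od and awk alone. This is only a minimal sketch, not what I actually ran: `urlencode` and `encode_asset_list` are hypothetical helper names, and it writes a space as %20 where PHP's urlencode writes +, which should decode the same way in a query string.

```shell
# POSIX sh sketch (also runs under bash). Helper names are hypothetical.

# urlencode STRING
# Percent-encodes every byte except A-Z a-z 0-9 - . _ ~
# od prints each byte as a decimal number, so multi-byte UTF-8
# characters are encoded byte by byte regardless of locale.
urlencode() {
  printf '%s' "$1" | od -An -v -tu1 | awk '
    {
      for (i = 1; i <= NF; i++) {
        b = $i + 0
        if ((b >= 48 && b <= 57) || (b >= 65 && b <= 90) ||
            (b >= 97 && b <= 122) || b == 45 || b == 46 ||
            b == 95 || b == 126)
          printf "%c", b
        else
          printf "%%%02X", b
      }
    }
    END { printf "\n" }'
}

# encode_asset_list < clean-assets-in-path.txt > dam_assets.txt
# Reads one asset path per line and writes the encoded path with the
# _charset_ parameter appended, ready for the references.json calls.
encode_asset_list() {
  while IFS= read -r path; do
    [ -n "$path" ] || continue
    printf '%s&_charset_=utf-8\n' "$(urlencode "$path")"
  done
}
```

Usage would be `encode_asset_list < clean-assets-in-path.txt > dam_assets.txt` after defining the two functions in the same shell session or script.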
# the following is a bash script that executes curl commands against the dam_assets.txt file above - save bash script as get-unreferenced.sh (chmod +x get-unreferenced.sh) and execute as ./get-unreferenced.sh dam_assets.txt
# the generated assets.txt file is overwritten for each curl command result
# when the result is empty, the unreferenced asset is written to unused_dam_assets.txt
- - - - - - - - - - -
#!/bin/bash
while IFS= read -r line; do
  dam_asset=$line
  curl -s -u "admin:admin" "http://localhost:4502/bin/wcm/references.json?path=$dam_asset" > assets.txt
  # an empty JSON array in the response means nothing references this asset
  if grep -q '\[\]' assets.txt; then
    echo "$dam_asset" >> unused_dam_assets.txt
  fi
done < "$1"
- - - - - - - - - - -
For referenced assets, I used the following script, saved as get-used.sh and run the same way (./get-used.sh dam_assets.txt):
- - - - - - - - - - -
#!/bin/bash
while IFS= read -r line; do
  dam_asset=$line
  curl -s -u "admin:admin" "http://localhost:4502/bin/wcm/references.json?path=$dam_asset" >> used.txt
done < "$1"
- - - - - - - - - - -
Command to print assets and references as csv:
jq --raw-output '.pages | group_by(.srcPath)[] | [.[0].srcPath, .[0].published, .[].references[]] | @csv' used.txt
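To get back to the original question (most used to least used), one option might be to count the referencing pages per asset and sort on that count. The sketch below is untested against a real instance and rests on an assumption taken from the jq command above: that each entry in the response's pages array carries exactly one "srcPath" key. The function names and the refs.json temp file are hypothetical, and the count covers references from anywhere in the repository, not only from one /content/site path.

```shell
# POSIX sh sketch (also runs under bash). Helper names are hypothetical.

# count_refs FILE
# Counts "srcPath" keys in one saved references.json response.
# Assumption: one "srcPath" per referencing page.
count_refs() {
  grep -o '"srcPath"' "$1" | wc -l | tr -d ' '
}

# rank_assets: reads "asset<TAB>count" lines on stdin and prints them
# sorted by count, highest first; ties are ordered by asset path.
rank_assets() {
  LC_ALL=C sort -t "$(printf '\t')" -k2,2nr -k1,1
}

# rank_dam_assets dam_assets.txt
# Queries references.json for each line, as in the scripts above,
# and prints the assets ranked from most to least referenced.
rank_dam_assets() {
  while IFS= read -r dam_asset; do
    curl -s -u "admin:admin" \
      "http://localhost:4502/bin/wcm/references.json?path=$dam_asset" > refs.json
    printf '%s\t%s\n' "$dam_asset" "$(count_refs refs.json)"
  done < "$1" | rank_assets
}
```

Running `rank_dam_assets dam_assets.txt > ranked_assets.txt` would then give one tab-separated line per asset, with unreferenced assets (count 0) at the bottom.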
# Useful commands
From Stack Exchange (I needed to install and use ggrep to do this on Mac OS X):
- - - - - - - - - - -
# grep for non-ascii characters
grep --color='auto' -P -n "[\x80-\xFF]" file.xml
This will give you the line number, and will highlight non-ASCII characters in red.
On some systems, depending on your settings, the above will not work, so you can grep for the inverse instead:
grep --color='auto' -P -n "[^\x00-\x7F]" file.xml
- - - - - - - - - - -
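# you may want to urldecode the unreferenced assets to view them more easily, etc. I needed to install and use gsed to do this on Mac OS X. As with the urlencode script, add the php tags to the generated file and then run it with php: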
/usr/local/bin/gsed 's/^/echo urldecode("/g' unused_dam_assets.txt |gsed 's/$/"),PHP_EOL;/g' > urldecode-unused.php
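The decode step could likewise be done without PHP or gsed. Again only a sketch under assumptions: `urldecode` and `decode_asset_list` are hypothetical names, and it assumes the lines in unused_dam_assets.txt still end with the &_charset_=utf-8 suffix that was appended earlier. awk is forced into the C locale so that each %XX is written back as a single byte.

```shell
# POSIX sh sketch (also runs under bash). Helper names are hypothetical.

# urldecode STRING
# Turns + into a space and each %XX into the byte it stands for.
# A % that is not followed by two hex digits is left unchanged.
urldecode() {
  printf '%s\n' "$1" | LC_ALL=C awk '
    BEGIN { hexdigits = "0123456789ABCDEF" }
    {
      s = $0; out = ""
      gsub(/\+/, " ", s)
      while ((i = index(s, "%")) > 0) {
        out = out substr(s, 1, i - 1)
        h = toupper(substr(s, i + 1, 2))
        if (h ~ /^[0-9A-F][0-9A-F]$/) {
          n = (index(hexdigits, substr(h, 1, 1)) - 1) * 16 + index(hexdigits, substr(h, 2, 1)) - 1
          out = out sprintf("%c", n)
          s = substr(s, i + 3)
        } else {
          out = out "%"
          s = substr(s, i + 1)
        }
      }
      print out s
    }'
}

# decode_asset_list < unused_dam_assets.txt
# Strips the trailing &_charset_=utf-8 from each line, then decodes it.
decode_asset_list() {
  while IFS= read -r line; do
    [ -n "$line" ] || continue
    urldecode "${line%&_charset_=utf-8}"
  done
}
```

For example, `decode_asset_list < unused_dam_assets.txt > unused_dam_assets_decoded.txt` would write the readable paths to a new file (the output filename is just an example).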
If you have issues with an older version of jq, upgrading it can sometimes resolve them.
If you use the results to delete unused assets via cURL, you may need to add "sleep 1" (or some other number) commands between each cURL command if the script executes too fast for the system to handle the deletes.
FYI - I only saw the following reports in Classic UI (under Tools) -- we're still using Classic UI:
/reports/healthcheck.html
/reports/auditreport.html
/reports/compreport.html
/reports/diskusage.html
/reports/userreport.html
/reports/wfinstances.html
/etc/reports/ugcreport.html
nothing for DAM asset use.
I couldn't find any reports in Touch UI (under Tools), and the suggested URL: /mnt/overlay/dam/gui/content/reports/reportlist.html was a 404 on my system.
We're running AEM 6.3.2.1.