Handling deleted tags in AEM

Level 2

Hi everyone,

Just seeking some advice on best practices around handling deleted tags. As far as I understand...

Version: 6.3, SP1

  1. Deleting a tag whilst it's referenced by a number of content nodes (pages) will remove the tag from /etc/tags but it won't remove it from the page's cq:tags property.
  2. The com.day.cq.tagging.impl.TagGarbageCollector background job will clean up tags that are no longer referenced by any pages.
  3. The cq:tags property of the pages will still contain references to the deleted tag until they are removed manually.
  4. The deleted tag shows up in the Touch UI in page properties section.

In view of the above, my questions are as follows:

  1. What is the expected behaviour of the TagManager API? Will it return the pages that still contain references to the deleted tags?
  2. Searching for the deleted tag in the Touch UI returns the pages that were previously tagged with it. Is that expected behaviour?
  3. Is my assumption correct that automatically removing the deleted tags (from the cq:tags property) from the referencing pages is bad practice, especially in scenarios where a large number of pages were tagged with it (high read/write load, potentially killing off the author instance)?
  4. Is my assumption correct that deleting tags for the above reasons should be done seldom and that this type of activity should not be performed by regular authors?
  5. What would be the recommended way to remove references to deleted tags from a large number of pages? Is the answer to figure out the taxonomy to start with and not mess with it?

Look forward to a good discussion on this topic...

Thanks,

Arup

11 Replies

Level 10

For point 1 - did you try that? Apply a tag to a page, delete the tag, and then use the API to see if that page is still returned. Try invoking the find method and see if that resource is returned.

Level 10

Also - I advise you to watch this session to learn more about assets and tags: Explore AEM Assets and Tags by their APIs

Level 2

Sadly the session doesn't really address the crux of the questions I have...

Community Advisor

Hi,

With the help of a query you can find the pages that reference the tags, and using the TagManager API (com.day.cq.tagging.TagManager) you can remove the tags from those pages. After that, the tags can be deleted from /etc/tags/.
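
As a minimal sketch of the query step, the helper below builds a JCR-SQL2 statement that selects page content nodes whose cq:tags property contains a given tag ID. The class name, method names, root path, and tag ID are hypothetical, and it assumes the pages store the tag as a plain tag ID (not an absolute path). Executing the statement against the repository is left out.

```java
/** Hypothetical helper: builds a JCR-SQL2 query for pages that reference a tag ID. */
final class TagReferenceQuery {

    private TagReferenceQuery() {
    }

    /**
     * Returns a JCR-SQL2 statement selecting cq:PageContent nodes below rootPath
     * whose multi-valued cq:tags property contains tagId.
     */
    static String build(String rootPath, String tagId) {
        return "SELECT * FROM [cq:PageContent] AS c"
                + " WHERE ISDESCENDANTNODE(c, '" + escape(rootPath) + "')"
                + " AND c.[cq:tags] = '" + escape(tagId) + "'";
    }

    /** Escapes single quotes so the value is safe inside a JCR-SQL2 string literal. */
    static String escape(String value) {
        return value.replace("'", "''");
    }
}
```

For example, TagReferenceQuery.build("/content/site", "marketing:interest/sports") yields a statement restricted to pages under /content/site. Restricting the root path keeps the result set smaller on large repositories.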



Arun Patidar

Level 8

We should not give content authors access to delete tags. It is good practice to have a super users group with access to delete tags, so that we can avoid human errors here.

Yes, when you delete a tag the references are not deleted automatically. You need to wait for garbage collection to run and clean up all references, and the GC runs every midnight. If you don't want to wait, you can change the configuration and set it to run every 5 or 10 minutes, but I don't recommend this: in my experience tags are not deleted very frequently, so it puts an unnecessary burden on the server.

To clear the references once the tag is deleted, you can manually change the configuration to run every 5 minutes and, once the job has completed, revert it to the original value.

If you don't have the option to change server configurations manually, you can write a simple script that finds references to deleted tags using the TagManager API and removes them. You can write a servlet for this or add an option to the Admin dashboard.

Level 2

I changed the configuration of the garbage collector on my local instance to run every 5 minutes with the following expression:

0/1 0/5 0 ? * * *

I can confirm that the TagGarbageCollector does not remove the references to the deleted tags from the cq:tags property of the page's jcr:content node. This is going to be a problem, as searching for the deleted tag in the UI now returns pages that contain this invalid reference.

Level 2

I can confirm too that the Tag Garbage Collector does not remove or fix references to the tag on pages. The only thing it does is remove the old location of a tag after it was moved. It simply checks whether the tag at the old location is still referenced on any pages and, if not, removes that tag. That means the OOTB approach is that the author is responsible for re-tagging all the affected pages (remove the tag that was moved/renamed and apply the tag once again, so that this time the new location of the tag is stored at the page level).

Level 8

Thanks, guys. I think my understanding of this was wrong. If that is the case, then we must write custom logic that queries all tags referenced on the page or component, iterates over the collection to find the deleted tags, and finally removes them from the page or component node.
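
The "iterate and remove" step could be sketched roughly as below. This is plain Java with no AEM dependencies: the class and method names are hypothetical, and the existence check is passed in as a predicate. In AEM that predicate might be something like tagId -> tagManager.resolve(tagId) != null, which is an assumption about how you would wire it up, not tested code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical helper: drops tag IDs that no longer resolve to an existing tag. */
final class DeletedTagFilter {

    private DeletedTagFilter() {
    }

    /**
     * Returns the tag IDs from tagIds for which tagExists is true,
     * preserving their original order.
     */
    static String[] keepExisting(String[] tagIds, Predicate<String> tagExists) {
        List<String> kept = new ArrayList<>();
        for (String tagId : tagIds) {
            if (tagExists.test(tagId)) {
                kept.add(tagId);
            }
        }
        return kept.toArray(new String[0]);
    }

    /** True if filtering removed at least one tag, i.e. the node needs to be written. */
    static boolean needsUpdate(String[] before, String[] after) {
        return before.length != after.length;
    }
}
```

Checking needsUpdate before writing means untouched pages are never modified, which limits both the write load and the number of pages that would later need republishing.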

Level 2

In practical terms, doing this is untenable from a performance point of view. Imagine one tag that is referenced by 50K pages. Do this and you will bring down your author instance, or whatever instance you run it on. Then there is the question of what you will do with the vast number of pages that have now been modified on author. Publish them? What are the consequences of that (surely you won't run a massive query and edit pages on publish)? The list goes on.

The only realistic option I can think of is to do the upfront work and figure out your taxonomy. Don't let authors delete tags or, as per the documentation, move or rename them. Those things have to be an admin activity, and the entire thing has to be planned and executed. In other words, deleting tags is not a BAU author activity.
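
As a rough illustration of what a planned, admin-run clean-up might involve, the sketch below splits the affected page paths into fixed-size batches so that changes could be saved once per batch rather than in one very large session. The class name and batch size are hypothetical, and the AEM side (updating cq:tags and committing after each batch) is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits a list of page paths into fixed-size batches. */
final class PageBatcher {

    private PageBatcher() {
    }

    /**
     * Returns consecutive sublists of items, each with at most batchSize elements.
     * The last batch may be smaller than batchSize.
     */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }
}
```

Batching does not remove the cost of touching every page; it only spreads it out, so the republishing question raised above still applies.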

I'd love to hear from Adobe staff if there is an alternative view!

Level 8

As I mentioned earlier, we must have proper governance for handling tags. The approach I described is one way of handling deleted tags, and the same thing can be done in other ways, for example using observers. But I feel you are expecting to hear a solution from Adobe staff, so Adobe people may be able to help you on this.

Level 2

Does anyone know if there is a way to have tag garbage collection work not only on cq:tags properties, but also custom page and component properties that we have created that can have tag values?