Support tag based invalidation in dispatcher

Level 2

2/8/24

Request for Feature Enhancement (RFE) Summary: Add a third invalidation mode to the AEM dispatcher that is based on tags rather than paths.
Use-case: The dispatcher currently supports page invalidation based on either stat file levels or TTL. These two approaches can't be reliably combined (with dispatcher 4.3.5) and are too inflexible for big multinational websites. Akamai has a concept of tag based invalidation, where each page supplies a list of tags when it is cached and those tags can then be used during invalidation.
Current/Experienced Behavior: Stat file or TTL based page invalidation.
Improved/Expected Behavior:

A third mode in which the publisher supplies a list of tags (arbitrary ASCII strings) in a custom header along with each cacheable page. The tags are stored and indexed along with the cached resources in the dispatcher cache (e.g. in a page.html.tags file). When a resource is invalidated, the flush agent invokes an OSGi service installed on the publisher to obtain a tag value. That tag value is passed to the dispatcher with the invalidation HTTP request (again in a custom header), and the dispatcher in turn invalidates all files that have this tag stored. As with stat file mode, this should be invalidation, not complete removal.
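The flow described above can be sketched in a few lines of Python. This is only an illustration of the idea, not dispatcher code: the header name X-Cache-Tags, the TagIndex class and the in-memory dictionaries are all assumptions made for the example, and a real implementation would persist the index in the cache folder.

```python
from collections import defaultdict

# Hypothetical header name; the dispatcher has no such header today.
TAG_HEADER = "X-Cache-Tags"


def parse_tags(headers):
    """Split the comma-separated tag header into a set of non-empty tags."""
    raw = headers.get(TAG_HEADER, "")
    return {t.strip() for t in raw.split(",") if t.strip()}


class TagIndex:
    """In-memory stand-in for the tag index the dispatcher would keep on disk."""

    def __init__(self):
        self.paths_by_tag = defaultdict(set)
        self.valid = {}  # cache path -> True while the cached copy may be served

    def store(self, path, headers):
        # Called when a response is cached: remember the tags the page supplied.
        self.valid[path] = True
        for tag in parse_tags(headers):
            self.paths_by_tag[tag].add(path)

    def invalidate(self, tag):
        # Mark every tagged file as stale instead of deleting it.
        stale = self.paths_by_tag.pop(tag, set())
        for path in stale:
            self.valid[path] = False
        return sorted(stale)


index = TagIndex()
index.store("/us/en/home.html", {TAG_HEADER: "collection-spring, footer"})
index.store("/cn/zh/home.html", {TAG_HEADER: "collection-spring"})
index.store("/us/en/about.html", {TAG_HEADER: "footer"})

# The flush request carries one tag; both home pages go stale together.
invalidated = index.invalidate("collection-spring")
```

Note that the two home pages live in different content trees, so no single statfileslevel setting would invalidate exactly these two files.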

 

Examples of how to use tags:

1. Product collection - all pages pertaining to a product collection supply the same tag, so all of them are invalidated together, regardless of the website structure in CRX.

2. Path dependencies - components report the CRX resources they used as tags (e.g. a hash of the resource path). When a resource is invalidated, all pages that used it are invalidated immediately as well (e.g. all carousels, menus and footers react immediately to a page name change). This limits the need for less agile mechanisms like TTL and provides a more flexible alternative to stat files.
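As a rough illustration of the path dependency example, a tag could be derived from a resource path like this. The "p-" prefix, the use of SHA-256 truncated to 16 hex characters, and the content paths are assumptions chosen for the sketch; any stable ASCII-safe hash would do.

```python
import hashlib


def path_tag(resource_path):
    """Derive a short ASCII tag from a repository path (hypothetical scheme)."""
    digest = hashlib.sha256(resource_path.encode("utf-8")).hexdigest()
    return "p-" + digest[:16]


def tags_for_page(used_resources):
    """Collect one tag per resource that the page's components read while rendering."""
    return sorted({path_tag(p) for p in used_resources})


# Resources a hypothetical home page read while rendering its menu and footer.
home_tags = tags_for_page([
    "/content/site/us/en/about",
    "/content/site/us/en/footer",
])
header_value = ",".join(home_tags)  # value the publisher would send in the tag header

# When /content/site/us/en/about is renamed, the flush agent would send this tag,
# which matches one of the tags stored for the home page.
flush_tag = path_tag("/content/site/us/en/about")
```

Hashing keeps the tags short and free of characters that are awkward in headers or folder names, at the cost of the tags no longer being human readable.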

Environment Details (AEM version/service pack, any other specifics if applicable): Apache dispatcher 4.3.5. Both cloud and on-premise dispatchers could implement this mode.
Customer-name/Organization name:  
Screenshot (if applicable):  
Code package (if applicable):  
3 Comments

Level 2

2/8/24

How the tag store could be implemented at the filesystem level:

1. Apart from the main cache content folder, like /var/cache/www/sites, there would be a second tag store folder, e.g. /var/cache/www/tags

Within that folder there would be subfolders, one per tag, e.g.

tags->

       tag1

       tag2

       tag3

Those folders would contain filesystem hard links to the cached resources that were tagged with the given tag, e.g.

tags->

     tag1->

           f39823-home.html ---hardlink---> /var/cache/www/sites/us/en/home.html

           233283-home.html ---hardlink---> /var/cache/www/sites/cn/zh/home.html

 

When a request to invalidate tag1 reaches the dispatcher, it simply checks which files are in the tag1 folder, invalidates them (e.g. by changing the modified date of the file) and removes the hard links.
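A minimal sketch of this hard link layout, using only the Python standard library. The function names, the six-character hash prefix and the choice of resetting the modification time to 0 are assumptions for illustration; how the dispatcher would actually mark a file as stale is not specified here.

```python
import hashlib
import os
import tempfile


def tag_file(cache_file, tags_root, tag):
    """Hard-link a cached file into the folder of one tag."""
    tag_dir = os.path.join(tags_root, tag)
    os.makedirs(tag_dir, exist_ok=True)
    # Prefix with a hash of the full path so two home.html files do not collide.
    prefix = hashlib.sha1(cache_file.encode("utf-8")).hexdigest()[:6]
    link = os.path.join(tag_dir, prefix + "-" + os.path.basename(cache_file))
    if not os.path.exists(link):
        os.link(cache_file, link)
    return link


def invalidate_tag(tags_root, tag):
    """Reset the mtime of every file linked under the tag, then remove the link."""
    tag_dir = os.path.join(tags_root, tag)
    if not os.path.isdir(tag_dir):
        return 0
    count = 0
    for name in os.listdir(tag_dir):
        link = os.path.join(tag_dir, name)
        # A hard link shares its inode with the cached file, so the new mtime
        # is also visible through the path under sites/.
        os.utime(link, (0, 0))
        os.unlink(link)
        count += 1
    return count


# Demo in a temporary folder standing in for /var/cache/www.
root = tempfile.mkdtemp()
site_dir = os.path.join(root, "sites", "us", "en")
os.makedirs(site_dir)
page = os.path.join(site_dir, "home.html")
with open(page, "w") as f:
    f.write("<html></html>")

tags_root = os.path.join(root, "tags")
tag_file(page, tags_root, "tag1")
invalidated = invalidate_tag(tags_root, "tag1")
```

Hard links only work within one filesystem, so the tags folder would have to live on the same volume as the cache. Also, if a cached file is replaced by a new file rather than rewritten in place, an existing link keeps pointing at the old inode, so the link would need to be recreated whenever the page is cached again.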

 

Administrator

2/14/24

@dominik_smogor 

Thanks for proposing this idea
 
This has been reported to engineering under the internal reference SITES-19858. The product team will triage this request to verify feasibility based on the prioritization model. This post will be updated according to the Jira request status.
Status changed to: Investigating

Administrator

2/14/24

@dominik_smogor 

Thanks for proposing this idea
 
This has been reported to engineering under the internal reference SITES-19859. The product team will triage this request to verify feasibility based on the prioritization model. This post will be updated according to the Jira request status.
Status changed to: Investigating