We need to bulk-publish content (tens of thousands of pages). The pages are created as a scheduled event, so all of them must go live almost simultaneously. The main challenge is how to invalidate that many pages on the dispatcher and then re-fetch them again (across multiple dispatchers).
What is the best strategy for this? Is there a possibility of leveraging network storage for the dispatcher cache?
You can issue an HTTP request that causes the dispatcher to delete cached files and immediately retrieve and re-cache them.
Delete and immediately re-cache files when web sites are likely to receive simultaneous client requests for the same page. Immediate recaching ensures that Dispatcher retrieves and caches the page only once, instead of once for each of the simultaneous client requests.
POST /dispatcher/invalidate.cache HTTP/1.1
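In full, that request carries the page path in headers. A minimal sketch in Python (the host and page path are placeholders; `CQ-Action`, `CQ-Handle` and `Content-Length` are the headers the `/dispatcher/invalidate.cache` endpoint expects):

```python
import urllib.request

def invalidate_request(dispatcher_host, page_path):
    """Return the URL and headers for a Dispatcher cache-invalidation POST."""
    url = dispatcher_host + "/dispatcher/invalidate.cache"
    headers = {
        "CQ-Action": "Activate",   # treat the path as newly activated
        "CQ-Handle": page_path,    # the page whose cache entry to flush
        "Content-Length": "0",     # empty body; the path travels in headers
    }
    return url, headers

def send_invalidate(dispatcher_host, page_path):
    """Fire the invalidation call (performs real network I/O)."""
    url, headers = invalidate_request(dispatcher_host, page_path)
    req = urllib.request.Request(url, data=b"", headers=headers, method="POST")
    return urllib.request.urlopen(req)

# Usage (only run against a dispatcher you control):
# send_invalidate("http://localhost:80", "/content/site/en/home")
```

Splitting request construction from sending keeps the header logic easy to verify without touching the network.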
Also, you can write a flush-cache servlet that sends an invalidate request to Dispatcher and then re-caches the content. Please take the necessary precautions while implementing the flush-cache servlet.
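The flow such a servlet would implement — flush each dispatcher in the farm, then pull the page back through it so the cache is warm before real traffic arrives — can be sketched like this (Python for brevity; a real AEM servlet would be Java, and the dispatcher host list is illustrative):

```python
import urllib.request

# Illustrative farm of dispatcher hosts; replace with your own.
DISPATCHERS = ["http://dispatcher1:80", "http://dispatcher2:80"]

def refetch_url(host, page_path):
    """URL used to pull the page back through the dispatcher cache."""
    return host + page_path + ".html"

def flush_and_recache(page_path, dispatchers=DISPATCHERS):
    """For each dispatcher: flush the page, then GET it once so the
    fresh copy is cached before real visitors request it."""
    results = []
    for host in dispatchers:
        inv = urllib.request.Request(
            host + "/dispatcher/invalidate.cache",
            data=b"", method="POST",
            headers={"CQ-Action": "Activate",
                     "CQ-Handle": page_path,
                     "Content-Length": "0"})
        urllib.request.urlopen(inv)                      # flush the cache entry
        resp = urllib.request.urlopen(refetch_url(host, page_path))  # re-cache
        results.append((host, resp.status))
    return results
```

Fetching immediately after the flush is what guarantees the page is rendered once on publish rather than once per simultaneous client request.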
Hope this helps!
The feature mentioned above requires no additional work on your side; it's a built-in feature of the dispatcher. But of course the prefetching needs to happen on each publish/dispatcher instance.
Years back I tried to build a shared dispatcher cache using NFS. It worked for the most obvious cases, but under rare circumstances I got I/O errors on the dispatcher. I did not have the time to investigate it in detail, because in principle it should work (the dispatcher does not use unusual system calls or anything like that). Maybe it was a problem with my setup or with one of the involved components (Linux kernel, nfsd, or a missing piece of configuration).
Things to consider when you switch to such a setup:
* Check how you want to do Blue/Green deployments on publish. In the farming approach (where blue and green do not share any systems) a Blue/Green deployment is easier to perform than when both sides share a single cache.
* If you have any problem with that NFS share, your sites are no longer available.
As @Jörg_Hoh mentioned, you can go with the re-fetching flush agent for this purpose, with some modification. If the number of pages is 10k, re-fetching all 10k of them will flood your publisher with requests from the dispatcher and become a performance issue in its own right. So it is better to re-fetch only the most frequently visited of those 10k pages, such as the homepage, and let all other pages be cached on user request. With a bit of Java you can filter by path/URL to control which pages are re-fetched.
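That selective warm-up could be sketched as follows (Python for illustration; the critical-page list, host, and throttle delay are all placeholders you would tune for your own traffic):

```python
import time
import urllib.request

def pages_to_prefetch(all_pages, critical):
    """Keep only the high-traffic pages worth eager re-fetching;
    the long tail is re-cached lazily on the first user request."""
    critical = set(critical)
    return [p for p in all_pages if p in critical]

def warm_cache(dispatcher_host, all_pages, critical, delay=0.1):
    """Re-fetch only the critical pages, throttled so the publisher
    is not hit with thousands of simultaneous render requests."""
    for path in pages_to_prefetch(all_pages, critical):
        urllib.request.urlopen(dispatcher_host + path + ".html")
        time.sleep(delay)  # simple throttle between requests
```

The throttle is the key design choice here: even for the critical subset, spacing out the re-fetches keeps the publish instance from seeing the same request spike the invalidation was meant to avoid.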
Hope this will help.