Dispatcher cache updated unexpectedly

Level 2

Hi guys, 

We are synchronizing content between different versions of AEM. Here are the steps I took:

1. On the older author instance, build a package and download it.

2. Upload and install the package on the newest author instance.

3. Since the content had been published before, we replicated this package so that the publisher also has the content.

Premise: we copied the cache from the older version's dispatcher to the new version's dispatcher as a backup.

 

The problem is this. I believed that if we use Package Manager (rather than the proper workflow process) to synchronize the content on author and publish, the dispatcher won't know that the content on publish has been updated. So I thought the cache would remain until I start a new workflow to tell the dispatcher that it needs to fetch a new file. We therefore originally planned to list all the pages we had modified and clear the cache for those files.

 

However, when we requested one of the pages (from the updated package I just mentioned) using the dispatcher IP plus the content path, the dispatcher log showed that the cache had been updated!?

Please see the log output below.

 
`cache file is older than lastflush -> flush [content/the requested page path.html]`  
`cache-action for [content/the requested page path.html]: CREATE` 
 
Confusing Points:
1. How is the "lastflush" time defined in the dispatcher?
2. When is "lastflush" stored?
3. How does the dispatcher know that the cache is too old and that it has to fetch a new copy?
 
The result really confused me. I would really appreciate it if someone knows the reason.
 
Thanks!!
 
 
8 Replies

Community Advisor

Hi @annieyang87957

When you replicated the content from the newer version of author to publish, the replication agent on the new author sent a cache invalidation request to Dispatcher, as it does whenever a page is published. The request causes Dispatcher to eventually refresh the file in the cache as new content is published.
Hence, the cache is expected to be updated if the replication agent is involved in pushing content.

However, replication through Package Manager won't clear the cache.
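
For readers who want to see what such an invalidation request looks like, here is a minimal Python sketch that only builds the request and does not send it. The `/dispatcher/invalidate.cache` path and the `CQ-Action` / `CQ-Handle` headers follow the Dispatcher documentation as I understand it; the host name, the content path, and the helper name `build_invalidate_request` are made up for illustration.

```python
from urllib.request import Request


def build_invalidate_request(dispatcher_base_url: str, handle: str) -> Request:
    """Build (but do not send) a Dispatcher cache invalidation request.

    dispatcher_base_url and handle are placeholders supplied by the caller.
    """
    return Request(
        url=dispatcher_base_url.rstrip("/") + "/dispatcher/invalidate.cache",
        method="POST",
        headers={
            "CQ-Action": "Activate",   # the kind of replication event
            "CQ-Handle": handle,       # the content path being invalidated
            "Content-Length": "0",     # the request carries no body
        },
    )


if __name__ == "__main__":
    # Hypothetical host and page path, for illustration only.
    req = build_invalidate_request(
        "http://dispatcher.example.com", "/content/mysite/en/page"
    )
    print(req.get_method(), req.full_url)
    for name, value in req.header_items():
        print(f"{name}: {value}")
    # Sending it would be urllib.request.urlopen(req); the dispatcher only
    # accepts the request if its configuration allows the calling client.
```

A flush agent normally sends this for you; building it by hand is mainly useful for testing whether a manual flush touches the .stat file.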

Level 2

Hi @Rohan_Garg 

Thanks for the reply, let me clear things up a little bit.

We used Package Manager to build a package from a list of pages, then used the Replicate action in Package Manager on author. (We are not using the regular publish process.)

We wondered whether the Package Manager Replicate action also triggers the dispatcher flush agent on the publisher.

We previously thought that using the Replicate action in Package Manager would not trigger the dispatcher flush.

[Attached screenshot: annieyang87957_0-1684320113092.png]

 

Community Advisor

Hi @annieyang87957,

We had actually tailored the package replication to invoke invalidation requests.

When you activate packages, the publish instance does not send invalidation requests for the content of the package. That's the default behavior.

 

Coming back to your query:

premise: we copied the caches in the dispatcher from the older version to the dispatcher of the new version as a backup.

When the dispatcher cache was copied to the new version, was the timestamp of the .stat files changed to the time of copying?

Level 2

Hi @Rohan_Garg 

May I ask whether there is only one .stat file, or whether every page (file) has its own .stat file?
We checked one specific page in the dispatcher, and its cache creation time is still in the past. The .stat file timestamp you mentioned is different from the cache creation time I mentioned, right?

 

We also found that whenever a publish is activated (not just for a specific page), the log shows that the .stat file is touched every time. Is it possible that each time the file is touched, its timestamp is set to the current time?

That confused me even more (please correct me if I am wrong):

The cache mechanism seems to be useless, because the old cache creation times are all in the past (since we copied the cache from the older version of AEM). When a new request comes in, the dispatcher will consider the cache outdated (compared with the .stat file's modification time) and ask publish for the files.

Community Advisor

1. Is there only one .stat file, or does every page (file) have its own .stat file?
Every folder has its own .stat file. Dispatcher creates .stat files in each folder from the docroot folder down to the level that you specify.

2. Is the .stat file timestamp you mentioned different from the cache creation time?
They are separate timestamps, but they are directly related: Dispatcher compares one against the other.

When content is updated, Dispatcher updates the timestamp of the stat file.
Files are invalidated by touching the .stat file. The .stat file’s last modification date is compared to the last modification date of a cached document. The document is re-fetched if the .stat file is newer.

3. Is it possible that every time the file is touched, the timestamp is set to the current time?
Yes, that is how it works. By default, when a .stat file is touched and invalidates cached content, Dispatcher deletes the cached content the next time it is requested.

 

To summarize, your .stat file's modification time might be newer than the modification time of the resource/page, and hence the dispatcher might consider the resource obsolete.
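
The comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Dispatcher's actual code: the function name `is_stale`, the strict "older than" comparison, and the temporary folder layout are all assumptions made for the example.

```python
import os
import tempfile
import time
from pathlib import Path


def is_stale(cached_file: Path, stat_file: Path) -> bool:
    """Return True when the cached file is older than the .stat file."""
    if not stat_file.exists():
        # Assumption for this sketch: no .stat file means nothing was flushed.
        return False
    return cached_file.stat().st_mtime < stat_file.stat().st_mtime


def demo():
    """Recreate the thread's situation in a throwaway directory."""
    with tempfile.TemporaryDirectory() as docroot:
        folder = Path(docroot) / "content" / "index"
        folder.mkdir(parents=True)
        page = folder / "a.html"
        page.write_text("<html>cached copy</html>")
        stat_file = folder / ".stat"
        stat_file.touch()

        now = time.time()
        month_ago = now - 30 * 24 * 3600
        os.utime(page, (month_ago, month_ago))  # cache copied from the old setup
        os.utime(stat_file, (now, now))         # an invalidation touched .stat
        stale_before_refetch = is_stale(page, stat_file)

        # Simulate the re-fetch: the cached file is rewritten after the flush.
        os.utime(page, (now + 1, now + 1))
        stale_after_refetch = is_stale(page, stat_file)
        return stale_before_refetch, stale_after_refetch


if __name__ == "__main__":
    print(demo())
```

On a real dispatcher host, the same check can be done by hand by comparing the modification time of a cached page with that of the .stat file that applies to it.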

Can you share a screenshot showing the cache creation time still in the past, or the .stat file timestamp and the resource/page timestamp?

Reference for the .stat file documentation:
https://experienceleague.adobe.com/docs/experience-manager-dispatcher/using/configuring/dispatcher-c...
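
As a rough model of the per-folder .stat files from answer 1, the sketch below lists which .stat files would be touched when a page is invalidated, assuming (from my reading of the documentation) that Dispatcher touches every .stat file from the docroot down to the folder of the invalidated file or the configured `/statfileslevel`, whichever is shallower. The function name, docroot, and page path are hypothetical.

```python
from pathlib import PurePosixPath


def stat_files_touched(docroot: str, handle: str, statfileslevel: int) -> list:
    """List the .stat files touched when `handle` is invalidated (simplified model)."""
    root = PurePosixPath(docroot)
    # Folders that contain the invalidated resource, excluding the resource itself.
    folders = PurePosixPath(handle.lstrip("/")).parent.parts
    depth = min(len(folders), statfileslevel)

    touched = [root / ".stat"]  # the docroot is level 0
    current = root
    for name in folders[:depth]:
        current = current / name
        touched.append(current / ".stat")
    return [str(path) for path in touched]


if __name__ == "__main__":
    # Hypothetical docroot and page path.
    for level in (0, 1, 2):
        print(level, stat_files_touched("/var/cache/docroot", "/content/index/a.html", level))
```

Note that in this model every page in a folder is checked against the same .stat file, so the configured level decides how much of the cache a single invalidation affects.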

Level 2

Hi @Rohan_Garg , 

Thanks for all your replies!!! They are clear and understandable.
I will provide the cache creation time (still in the past), or the .stat file timestamp and the resource/page timestamp, tomorrow when I am in the office.

 

But first, let me use an example to check whether I understand the mechanism correctly.

1. There are five pages under the index.html node, such as

- index/a.html

- index/b.html

- index/c.html

- index/d.html

These pages were replicated in the old AEM version (6.3) a month ago, so the dispatcher holds cached copies of these files.

As I mentioned, I copied these cache files to the new version of AEM (6.5).

 

2. Later, I modify only a.html and b.html in the 6.3 version, use Package Manager to package these 2 files, download the package, and upload it to the 6.5 author.

 

3. In the 6.5 version, I replicate this package, and this action touches the .stat file. Later, if I request these 2 paths using the dispatcher IP, the dispatcher will know that these 2 paths have been updated and will fetch new copies. In addition, if I request c.html and d.html using the dispatcher IP, the cached copies (from one month ago) will still be served, since these 2 files are not in the installed package.

 

Please correct me if I am wrong. Thank you so much!!

Community Advisor

Bingo! That's what I think is happening behind the scenes!

Let's take a look at the .stat file timestamp and your page timestamp to check whether the reasoning is plausible!

Employee Advisor

Hi,

 

  1. The "lastflush" time in the dispatcher refers to the timestamp of the last cache clearing or invalidation.
  2. The "lastflush" time is tracked by the dispatcher through the last modification time of the .stat file, as described earlier in this thread. It is updated whenever an invalidation request touches that file.
  3. The dispatcher determines whether a cached file is outdated by comparing the "lastflush" time with the modification time of the cached file. If the cached file is older than "lastflush", as in the log line "cache file is older than lastflush", the dispatcher treats it as outdated and fetches a fresh copy.

Please review your dispatcher configuration and rules to ensure proper cache invalidation.