SOLVED

Caching on AEM Cloud publish adobeaemcloud.com domain

Level 4

Hey guys,

I am having an issue when I access the publish domain https://publish-p00001-e000002.adobeaemcloud.com/content/appName/en/shopping-cart.html

Changes published from author are only visible on the publish domain URL when I add query parameters to the request.

Checking the logs, I see the following:

  1. aemdispatcher logs - 
    [27/Nov/2024:07:06:44 +0000] [I] [cm-p00001-e00002-aem-publish-749967f856-nbzdf] "GET /content/appName/en/shopping-cart.html" - 11ms [publishfarm-marks/-] [actionhit] publish-p00001-e00002.adobeaemcloud.com
    The request with no query params is being served from dispatcher even after having the cache cleared.
    [27/Nov/2024:07:07:15 +0000] [I] [cm-p00001-e00002-e00002-publish-749967f856-nbzdf] "GET /content/appName/en/shopping-cart.html?nocache=trueCL" 200 326ms [publishfarm-marks/0] [actionnone] publish-p00001-e00002.adobeaemcloud.com

  2. httpd error logs -
    Wed Nov 27 07:06:44.810984 2024 [dispatcher:debug] [pid 627:tid 764] [cm-p00001-e00002-aem-publish-749967f856-nbzdf] checking [/content/appName/en/shopping-cart.html]
    Wed Nov 27 07:06:44.811002 2024 [dispatcher:debug] [pid 627:tid 764] [cm-p00001-e00002-aem-publish-749967f856-nbzdf] never flushed [/mnt/var/www/html/content/appName/.stat] -> use cache [/mnt/var/www/html/content/appName/en/shopping-cart.html]
    Wed Nov 27 07:06:44.811007 2024 [dispatcher:debug] [pid 627:tid 764] [cm-p00001-e00002-aem-publish-749967f856-nbzdf] cache-action for [/content/appName/en/shopping-cart.html]: DELIVER
    Wed Nov 27 07:06:44.811009 2024 [dispatcher:debug] [pid 627:tid 764] [cm-p00001-e00002-aem-publish-749967f856-nbzdf] request declined
    Wed Nov 27 07:06:44.811058 2024 [dispatcher:debug] [pid 627:tid 764] [cm-p00001-e00002-aem-publish-749967f856-nbzdf] response.headers[Content-Type] = "text/html;charset=utf-8"
    Wed Nov 27 07:06:44.811064 2024 [dispatcher:debug] [pid 627:tid 764] [cm-p00001-e00002-aem-publish-749967f856-nbzdf] response.headers[X-Content-Type-Options] = "nosniff"

Why would the logs show /content/appName/.stat as never flushed when we flushed a child page underneath it?

Is it because of the statfileslevel? It's currently set to "2". Do we need to increase it to 3 or 4? How do we determine that based on the current node hierarchy?

/statfileslevel "2"


1 Accepted Solution

Correct answer by
Community Advisor

@NageshRaja - I have one observation based on the host files you sent over and the default one you mentioned above- 

You don't have any ServerAlias set to localhost. Even in AEMaaCS, the publisher clears the cache by calling the /dispatcher/invalidate.cache API on localhost, because the dispatcher and the publisher share the same runtime in the Kubernetes container.

I would suggest you create a new vhost whose only ServerAlias is localhost. It doesn't need any rewrites; a bare-minimum vhost skeleton should do.
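
A minimal sketch of what such a skeleton could look like, written out by a small shell snippet. The file name localhost.vhost, the VHOST_DIR variable and the ServerName value are hypothetical; the directives are borrowed from the aem_publish.vhost posted in this thread, and the directory is the one used by the symlink script in this thread. Treat it as a starting point to validate locally, not a confirmed Adobe-recommended file.

```shell
#!/bin/sh
# Sketch: write a minimal vhost that answers only to "localhost".
# localhost.vhost and VHOST_DIR are hypothetical names chosen for illustration.
vhost_dir=${VHOST_DIR:-dispatcher/src/conf.d/available_vhosts}
mkdir -p "$vhost_dir"

# The quoted 'EOF' keeps ${DOCROOT} literal so Apache resolves it, not the shell.
cat > "$vhost_dir/localhost.vhost" <<'EOF'
<VirtualHost *:80>
    ServerName "localhost"
    ServerAlias "localhost"
    DocumentRoot "${DOCROOT}"
    <Directory />
        <IfModule disp_apache2.c>
            # Hand requests, including /dispatcher/invalidate.cache, to the Dispatcher
            SetHandler dispatcher-handler
        </IfModule>
        AllowOverride None
    </Directory>
    <IfModule disp_apache2.c>
        DispatcherUseProcessedURL 1
        DispatcherPassError 0
    </IfModule>
</VirtualHost>
EOF
echo "wrote $vhost_dir/localhost.vhost"
```

If the project is not in flexible mode, the new file would presumably also need a symlink in enabled_vhosts, like the other vhosts.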

If you check the WKND codebase, for instance, you will notice that both default.vhost and wknd.vhost have ServerAlias set to "*", which allows the cache invalidation.
In your codebase no ServerAlias is set to "*", which is understandable, but none is set to "localhost" either, so no vhost serves the invalidation request from publisher to dispatcher.
Can you please try this and let me know if it works for you?

If not, then please share your exact dispatcher configuration for one tenant at least.

 

Best Regards,
Rohan Garg

14 Replies

Level 9

Hi @NageshRaja,

There is definitely a problem with your Dispatcher cache not being flushed. Can you confirm that you defined the statfileslevel in the invalidation.farm? That is the farm used for Dispatcher flush/invalidation requests.

The statfileslevel counts from your DOCROOT, so the value looks fine. In the aemdispatcher logs, you should see a HIT or MISS for each request you make. That is how you know whether the HTML is served from the Dispatcher cache.
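
To get an overview rather than reading line by line, the cache actions can be tallied. This is a rough sketch that assumes the log format shown in the question, where the action appears as a token such as [actionhit] or [actionnone]; the function name count_actions and the shortened sample lines are made up for illustration.

```shell
#!/bin/sh
# Tally the cache action on each aemdispatcher log line read from stdin.
# Prints one "<action> <count>" pair per line, sorted by action name.
count_actions() {
    sed -n 's/.*\[action\([a-z]*\)\].*/\1/p' | sort | uniq -c | awk '{ print $2, $1 }'
}

# Two shortened sample lines in the format shown in the question.
count_actions <<'EOF'
[27/Nov/2024:07:06:44 +0000] [I] [pod] "GET /content/appName/en/shopping-cart.html" - 11ms [publishfarm/-] [actionhit] host
[27/Nov/2024:07:07:15 +0000] [I] [pod] "GET /content/appName/en/shopping-cart.html?nocache=true" 200 326ms [publishfarm/0] [actionnone] host
EOF
```

In practice you would pipe the real log into count_actions instead of the sample lines.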

I suggest you look at the logs of your cache invalidation requests. Also, I suggest you debug the issue on a local Dispatcher instance.

 

Good luck,

Daniel

Level 4

hey @daniel-strmecki - I do see hit and miss requests - the requests with query params are reported as [actionhit] while those without are reported as [actionmiss]

I am getting validator issues in my current dispatcher folder, mostly because the files in enabled_vhosts are not symlinks; they are exact copies of the files in available_vhosts.

Level 9

Hi @NageshRaja,

It should be the other way around: requests with query params should be a MISS. Here is a shell script to fix the symlinks:

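# The script starts one level below the project root, hence the "cd ..".
cd ..
project_root=$(pwd)

# Replace each copied vhost with a symlink to its available_vhosts counterpart.
cd dispatcher/src/conf.d/enabled_vhosts || exit 1
for host in *.vhost; do
	[ -e "$host" ] || continue   # skip the literal pattern when nothing matches
	rm "$host"
	ln -s "../available_vhosts/$host" "$host"
done

# Do the same for the farms.
cd "$project_root/dispatcher/src/conf.dispatcher.d/enabled_farms" || exit 1
for farm in *.farm; do
	[ -e "$farm" ] || continue
	rm "$farm"
	ln -s "../available_farms/$farm" "$farm"
done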

 

Hope this helps,

Daniel

Community Advisor

@NageshRaja 

 

Symlinks are not needed with the flexible mode of AEM Dispatcher. Details are available at: https://experienceleague.adobe.com/en/docs/experience-manager-cloud-service/content/implementing/con... 

Please verify whether opt-in/USE_SOURCES_DIRECTLY is configured in the dispatcher. If it is, symlinks are not needed.

-------

 

For cached URLs with query params, please review whether ignoreUrlParams is configured. https://experienceleague.adobe.com/en/docs/experience-manager-dispatcher/using/configuring/dispatche... 
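
Both checks can be scripted. The sketch below assumes the dispatcher sources live under dispatcher/src, as in the symlink script earlier in this thread; the function name check_dispatcher_config is made up. It only looks for the opt-in file and for the string /ignoreUrlParams, so a commented-out section would still be reported as configured.

```shell
#!/bin/sh
# Report whether flexible mode is enabled and whether ignoreUrlParams appears
# anywhere in the farm configuration under the given dispatcher source folder.
check_dispatcher_config() {
    src=$1
    if [ -f "$src/opt-in/USE_SOURCES_DIRECTLY" ]; then
        echo "flexible mode: enabled"
    else
        echo "flexible mode: not enabled"
    fi
    # Plain text search only: it does not parse the .any/.farm syntax.
    if grep -rq "/ignoreUrlParams" "$src/conf.dispatcher.d" 2>/dev/null; then
        echo "ignoreUrlParams: configured"
    else
        echo "ignoreUrlParams: not found"
    fi
}

# Default to the folder layout assumed above; pass another path as the first argument.
check_dispatcher_config "${1:-dispatcher/src}"
```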


Aanchal Sikka

Community Advisor

@NageshRaja 

 

Please enable debug logs on the dispatcher. Publish the child page and check the dispatcher logs. Details on what the logs should look like are available at the link.

For reference, copying the relevant logs here as well:

aanchalsikka_2-1732696117230.png

 

 

With statfileslevel 2, .stat files would have been created at:

docroot

/content

/content/appName

Any publish of a child page should touch all .stat files along this path. For a better understanding, refer to the link.
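
To see how this plays out for a given level and page, here is a small sketch that lists the folders whose .stat file would be touched. It assumes the documented behaviour that .stat files are touched from the docroot down to the configured level or the folder of the invalidated file, whichever is shallower; the function name stat_dirs is made up for illustration.

```shell
#!/bin/sh
# List the folders (relative to the docroot) whose .stat file would be touched
# when `path` is invalidated with the given statfileslevel.
stat_dirs() {
    level=$1
    path=$2
    dir=${path%/*}      # drop the file name, e.g. /content/appName/en
    echo "/"            # level 0 is the docroot itself
    current=""
    depth=0
    old_ifs=$IFS
    IFS=/
    for segment in $dir; do
        [ -z "$segment" ] && continue       # skip the empty field before the first slash
        depth=$((depth + 1))
        [ "$depth" -gt "$level" ] && break  # stop once the configured level is passed
        current="$current/$segment"
        echo "$current"
    done
    IFS=$old_ifs
}

stat_dirs 2 /content/appName/en/shopping-cart.html
```

For level 2 this prints /, /content and /content/appName, matching the list above. Raising the level to 3 would add /content/appName/en, which narrows invalidation to a single language branch.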

 


Aanchal Sikka

Level 4

Thanks for replying @aanchal-sikka. However, in your screenshot I see the "Touched /mnt/var.." lines logged as Info, not Debug.
We have already set the log level to debug and are still not getting the logs you mentioned:

Define DISP_LOG_LEVEL debug

Community Advisor

Hey @NageshRaja,
The stat file level is set adequately. Can you share the vhost config and cache rules?

Level 4

default aem_publish.vhost - there are other brand specific vhosts as well
Sent that to you in inbox masking the brands - please check

# Collect any environmental variables that are set in /etc/sysconfig/httpd
# Collect the dispatchers number
PassEnv DISP_ID
<VirtualHost *:80>
ServerName publish
# Put names of which domains are used for your published site/content here
ServerAlias ${PUBLISH_DEFAULT_HOSTNAME}
# Use a doc root that matches what's in the /etc/httpd/conf/publish-farm.any
DocumentRoot ${DOCROOT}
# Add header breadcrumbs for help in troubleshooting
<IfModule mod_headers.c>
Header always add X-Vhost "publish"
Header merge X-Frame-Options SAMEORIGIN "expr=%{resp:X-Frame-Options}!='SAMEORIGIN'"
Header merge X-Content-Type-Options nosniff "expr=%{resp:X-Content-Type-Options}!='nosniff'"
# Make sure proxies don't deliver the wrong content
Header append Vary User-Agent env=!dont-vary
</IfModule>
<Directory />
# Update /etc/sysconfig/httpd with setting the PUBLISH_WHITELIST_ENABLED from 0 or 1 to enable or disable ip restriction rules
<IfModule disp_apache2.c>
# Some items cache with the wrong mime type
# Use this option to use the name to auto-detect mime types when cached improperly
ModMimeUsePathInfo On
# Use this option to avoid cache poisoning
# Sling will return /content/image.jpg as well as /content/image.jpg/ but apache can't search /content/image.jpg/ as a file
# Apache will treat that like a directory. This assures the last slash is never stored in cache
DirectorySlash Off
# Enable the dispatcher file handler for apache to fetch files from AEM
SetHandler dispatcher-handler
</IfModule>
Options FollowSymLinks
AllowOverride None
# Insert filter
SetOutputFilter DEFLATE
# Don't compress images
SetEnvIfNoCase Request_URI \
\.(?:gif|jpe?g|png)$ no-gzip dont-vary
</Directory>
<Directory "${DOCROOT}">
AllowOverride None
Require all granted
</Directory>
<IfModule disp_apache2.c>
# Enabled to allow rewrites to take effect and not be ignored by the dispatcher module
DispatcherUseProcessedURL 1
# Default setting to allow all errors to come from the aem instance
DispatcherPassError 0
</IfModule>
<IfModule mod_rewrite.c>
ReWriteEngine on
# LogLevel warn rewrite:info
# Global rewrite include
# Update /etc/sysconfig/httpd with setting the PUBLISH_FORCE_SSL from 0 or 1 to enable or disable enforcing SSL
<If "${PUBLISH_FORCE_SSL} == 1">
</If>
</IfModule>
</VirtualHost>

Level 9

@NageshRaja did you confirm that you are setting the statfileslevel in the invalidation.farm, not your site-specific farm file?

 

Good luck,

Daniel

Level 4

hey @daniel-strmecki, it's set in all the farm files for the individual tenants as well as in default.farm

here's the default.farm below -

# (feature supported since dispatcher build 2.6.3.5222)
/clientheaders {
$include "../clientheaders/clientheaders.any"
}
# hostname globbing for farm selection (virtual domain addressing)
/virtualhosts {
$include "../virtualhosts/virtualhosts.any"
}
# the load will be balanced among these render instances
/renders {
$include "../renders/default_renders.any"
}
# only handle the requests in the following acl. default is 'none'
# the glob pattern is matched against the first request line
/filter {
$include "../filters/filters.any"
}
# if the package is installed on publishers to generate a list of all content with a vanityurl attached
# this section will auto-allow the items to bypass the normal dispatcher filters
# Reference: https://docs.adobe.com/docs/en/dispatcher/disp-config.html#Enabling%20Access%20to%20Vanity%20URLs%20-%20/vanity_urls
# /vanity_urls {
# /url "/libs/granite/dispatcher/content/vanityUrls.html"
# /file "/tmp/vanity_urls"
# /delay 300
# /loadOnStartup 1
# }
# allow propagation of replication posts (should seldomly be used)
/propagateSyndPost "0"
# the cache is used to store requests from the renders for faster delivery
# for a second time.
/cache {
# The cacheroot must be equal to the document root of the webserver
/docroot "${DOCROOT}"
# sets the level up to which files named ".stat" will be created in the
# document root of the webserver. when an activation request for some
# handle is received, only files within the same subtree are affected
# by the invalidation.
/statfileslevel "2"
# caches also authorized data
/allowAuthorized "0"

Level 9

Hi @NageshRaja,

The statfileslevel setting only matters for the farm you use for invalidation. Please try publishing a page and check in the Dispatcher logs which farm it hits. In my case, on AEMaaCS, this is the invalidation.farm, which is used only for invalidation and is not mapped to any virtual hosts.

 

Good luck,

Daniel

Level 4

Kudos @Rohan_Garg! This worked brilliantly, but how come the Adobe documentation doesn't specify this anywhere?

Community Advisor

If you check the aemdispatcher logs for any cloud env, you will see the following:

aio cloudmanager:tail-log --programId abcde 0000001 dispatcher aemdispatcher

[17/Dec/2024:11:50:54 +0000] [I] [cm-abcde-0000001-aem-publish-9c97fb757-pt2qc] "POST /dispatcher/invalidate.cache" 200 0ms [publishfarm/-] [actionpurge] localhost
[17/Dec/2024:11:51:54 +0000] [I] [cm-abcde-0000001-aem-publish-9c97fb757-pt2qc] "POST /dispatcher/invalidate.cache" 200 0ms [publishfarm/-] [actionpurge] localhost
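
To pull just the farm and host out of lines like these, a small filter helps. This is a sketch that assumes the line layout shown above (farm in the bracket before the action, host as the last field); the function name invalidation_targets is made up for illustration.

```shell
#!/bin/sh
# Show which farm and Host each cache invalidation request hit.
# Reads aemdispatcher log lines from stdin and ignores all other requests.
invalidation_targets() {
    grep 'POST /dispatcher/invalidate.cache' |
        sed -n 's#.*\[\([^]/]*\)/[^]]*\] \[action[a-z]*\] \(.*\)$#farm=\1 host=\2#p'
}

# Sample line copied from the log output above.
invalidation_targets <<'EOF'
[17/Dec/2024:11:50:54 +0000] [I] [cm-abcde-0000001-aem-publish-9c97fb757-pt2qc] "POST /dispatcher/invalidate.cache" 200 0ms [publishfarm/-] [actionpurge] localhost
EOF
```

For the sample line this prints farm=publishfarm host=localhost, which is the Host that a vhost needs to accept for the invalidation to go through.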