Dispatcher Redirection and Error Handling

Community Advisor

Hi Team,

 

I have the following scenario.

My rewrite rules are applied in this order:

1. 

RewriteRule ^/$ /content/xyz/us/en/home-page.html [R=301,L]

This redirects a request for the root to home-page.html.

 

2. 

RewriteRule ^/content/xyz/us/en/(.*)$ /$1 [NE,L,R=301]

This shortens the URL by removing /content/xyz/us/en from the request path.

 

3. 

RewriteRule ^/(.*)$ /content/xyz/us/en/$1 [NC,PT,L]

This rule prepends /content/xyz/us/en again and passes the full path through to publish.

These rules work correctly: the browser shows the shortened URL and the complete path is forwarded to publish.
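
Read together, the three rules form one ordered block. The sketch below assumes they sit directly inside the vhost (server context) with RewriteEngine On, which the post does not show; the comments are annotations, not part of the original configuration.

```apache
# Assumption: rules live directly inside the <VirtualHost> (server context).
RewriteEngine On

# 1. Root request: external redirect to the long home-page URL.
RewriteRule ^/$ /content/xyz/us/en/home-page.html [R=301,L]

# 2. Long URL sent by the browser: external redirect to the shortened form.
RewriteRule ^/content/xyz/us/en/(.*)$ /$1 [NE,L,R=301]

# 3. Short URL: internal rewrite back to the full path, passed through (PT)
#    so that the Dispatcher handler sees /content/xyz/us/en/...
RewriteRule ^/(.*)$ /content/xyz/us/en/$1 [NC,PT,L]
```

With this ordering a request for / takes two redirects (first to the long URL, then to /home-page.html) before rule 3 rewrites it internally. As written, rule 3 also matches paths such as /etc.clientlibs/..., so a real setup presumably excludes those with a RewriteCond.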

 

4. The Vhost file contains the following -

<IfModule disp_apache2.c>
# Enabled to allow rewrites to take effect and not be ignored by the dispatcher module
DispatcherUseProcessedURL On
# Set to 1 so that AEM error responses are handed to Apache's ErrorDocument handling (the default, 0, spools all AEM error responses to the client)
DispatcherPassError 1
</IfModule>

ErrorDocument 404 /content/xyz/us/en/errors/404.html
ErrorDocument 500 /content/xyz/us/en/errors/500.html
ErrorDocument 502 /content/xyz/us/en/errors/500.html
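
For reference, DispatcherPassError also accepts ranges of status codes instead of 0 or 1. A minimal sketch, assuming only the listed codes should be handed to Apache's ErrorDocument handling; the ranges are illustrative and not taken from the post.

```apache
<IfModule disp_apache2.c>
    # Hand only these status codes to Apache's ErrorDocument handling;
    # any other error response is sent to the client as AEM returned it.
    DispatcherPassError 400-404,500-504
</IfModule>

ErrorDocument 404 /content/xyz/us/en/errors/404.html
ErrorDocument 500 /content/xyz/us/en/errors/500.html
```

A range can be useful when some error responses, for example JSON errors from a servlet, should reach the client unchanged.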

 

The problem is that the ErrorDocument is not picked up for invalid URLs; the Dispatcher spools publish's error page directly.

The logs for such a request are below:

172.17.0.1 "localhost" - [05/Aug/2022:15:12:21 +0000] "GET /content/xyz/us/en/abc.html HTTP/1.1" 301 233 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:103.0) Gecko/20100101 Firefox/103.0"
172.17.0.1 "localhost" - [05/Aug/2022:15:12:21 +0000] "GET /abc.html HTTP/1.1" 404 2400 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:103.0) Gecko/20100101 Firefox/103.0"
[05/Aug/2022:15:12:21 +0000] "GET /content/xyz/us/en/abc.html HTTP/1.1" 404 none [publishfarm/0] 9ms "localhost"

 

As the logs show, the Dispatcher is not serving the 404 page configured in ErrorDocument.

 

Can you please let me know where the issue is?
@arunpatidar , @kautuk_sahni - Any quick help on this ?

 

11 Replies

Level 3

Hi @Rohan_Garg,

Please try the steps below and see whether they help.
Step 1: Comment out the include of your custom rewrites file in the vhost, restart, and request a complete invalid URL (/content/xyz/us/en/abc.html). Check whether that request is served /errors/404.html.

Step 2: Change the error document to the shortened URL (ErrorDocument 404 /errors/404.html) and remove the second rule from your rewrites, as below:

RewriteRule ^/$ /content/xyz/us/en/home-page.html [PT,L]
RewriteRule ^/content/xyz/us/en/(.*)$ /$1 [NE,L,R=301] -> Remove this rewrite
RewriteRule ^/(.*)$ /content/xyz/us/en/$1 [NC,PT,L]

Now request an invalid short URL (/abc.html) and check whether it takes you to the error page.

Community Advisor

@achennapragada - Thanks for your response. I removed all my rewrite rules and tested again.

With no rewrite rules in place there are two use cases:

If the request path starts with /content and ends in .html, publish's own error page is returned.

 

Use Case 1 - For a request that starts with /content and ends in .html, publish's error page is shown:

172.17.0.1 "localhost" - [06/Aug/2022:08:36:58 +0000] "GET /content/abc.html?nocache=true1212 HTTP/1.1" 404 2424 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:103.0) Gecko/20100101 Firefox/103.0"
[06/Aug/2022:08:36:58 +0000] "GET /content/abc.html?nocache=true1212 HTTP/1.1" 404 none [publishfarm/0] 17ms "localhost"

 

Use Case 2 - For a request that does not start with /content or does not end in .html, the Dispatcher shows the ErrorDocument page:
172.17.0.1 "localhost" - [06/Aug/2022:08:37:12 +0000] "GET /xyz/abc.html?nocache=true12121 HTTP/1.1" 404 2945 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:103.0) Gecko/20100101 Firefox/103.0"
[06/Aug/2022:08:37:12 +0000] "GET /content/sitename/us/en/errors/404.html HTTP/1.1" - hit [publishfarm/-] 2ms "localhost"

 

Any idea what might be causing the two to behave differently?

Level 3

Hi @Rohan_Garg , please try these rewrites.

 

# rewrites for root redirect
RewriteRule ^/?$ /content/xyz/us/en.html [PT,L]

 

# Rewrite shortened URLs to the full content path, skipping AEM system paths
RewriteCond %{REQUEST_URI} !^/?(etc\.clientlibs|resources|libs|content|apps) [NC]
RewriteRule ^/(.*) /content/xyz/us/en/$1 [PT,L,QSA]

Community Advisor

@achennapragada - The rewrite rules are fine and validated on cloud as well.

The issue occurred only locally, because the vhost files in my enabled folders were exact copies of the files in the available folders rather than symlinks.

These redirects on cloud are working fine as intended.

Thanks for your help!

Community Advisor

@arunpatidar - Sorry to disturb you with this again; I hope you can provide some insight.

If I remove all my rewrite rules and test, there are two use cases:

If the request path starts with /content and ends in .html, publish's own error page is returned.

 

Use Case 1 - For a request that starts with /content and ends in .html, publish's error page is shown:

172.17.0.1 "localhost" - [06/Aug/2022:08:36:58 +0000] "GET /content/abc.html?nocache=true1212 HTTP/1.1" 404 2424 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:103.0) Gecko/20100101 Firefox/103.0"
[06/Aug/2022:08:36:58 +0000] "GET /content/abc.html?nocache=true1212 HTTP/1.1" 404 none [publishfarm/0] 17ms "localhost"

 

Use Case 2 - For a request that does not start with /content or does not end in .html, the Dispatcher shows the ErrorDocument page:


172.17.0.1 "localhost" - [06/Aug/2022:08:37:12 +0000] "GET /xyz/abc.html?nocache=true12121 HTTP/1.1" 404 2945 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:103.0) Gecko/20100101 Firefox/103.0"
[06/Aug/2022:08:37:12 +0000] "GET /content/sitename/us/en/errors/404.html HTTP/1.1" - hit [publishfarm/-] 2ms "localhost"

 

172.17.0.1 "localhost" - [06/Aug/2022:08:54:34 +0000] "GET /content/abc/rgarg.json?nocahce=true HTTP/1.1" 404 2948 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:103.0) Gecko/20100101 Firefox/103.0"
[06/Aug/2022:08:54:34 +0000] "GET /content/sitename/us/en/errors/404.html HTTP/1.1" - hit [publishfarm/-] 1ms "localhost"

 

Can you think of a reason why it behaves this way?

 

Thanks again!

Community Advisor

Hi,

I would suggest testing your redirect rules here against all the use cases you want to cover, including mappings.

https://technicalseo.com/tools/htaccess/ 



Arun Patidar

Community Advisor

@arunpatidar - Thanks for recommending this tool; it's excellent.

However, there is a small issue.

My redirect rules are written correctly and work as expected in the tool:

RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^.]*[^./])$ $1.html [R=301,L]

Test URL - https://brandA.com/content/brandA/us/en/home
New URL - https://brandA.com/content/brandA/us/en/home.html

 

However, on the dev environment the following happens:

https://brandA.com/content/brandA/us/en/home
http://brandA.com/content/brandA/us/en/home.html

https://brandA.com/content/brandA/us/en/home.html

 

In effect, every redirect rule first sends the request to http before it is moved back to https.

This doubles the requests shown in the browser's network tab and causes other issues.

 

My virtual hosts all listen on port 80, as below; this was done so that http requests are handled as well:

<VirtualHost *:80>

 

Could you suggest what might cause the URL to change from https to http and back to https?
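
One possible cause, offered as an assumption rather than a diagnosis: if TLS is terminated upstream (a CDN or load balancer) and the request reaches Apache as plain HTTP on port 80, a relative redirect target is completed with the http scheme. The sketch below writes the scheme explicitly when an X-Forwarded-Proto header says the client used https; the header name is the conventional one and is not confirmed anywhere in this thread.

```apache
# Assumption: TLS ends at a CDN or load balancer that forwards plain HTTP to
# port 80 and sets X-Forwarded-Proto; neither is confirmed in this thread.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{HTTP:X-Forwarded-Proto} =https
# An absolute target keeps the scheme Apache would otherwise derive from the
# port-80 vhost (http) out of the Location header.
RewriteRule ^([^.]*[^./])$ https://%{HTTP_HOST}$1.html [R=301,L]
```

The original rule can follow this one unchanged as a fallback for requests that really did arrive over http.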

Community Advisor

@Rohan_Garg 

I am not sure what the issue could be here; one of the other redirects may be causing this.



Arun Patidar

Community Advisor

Hi All,

 

The issue above occurred because I was using exact copies of the vhost files in the enabled folders.

On my cloud instance the enabled folders contain symlinks, so the rewrite rules above worked correctly.

 

Locally, however, I still get the error below when I create the symlinks for the enabled vhosts using "mklink link target":

C:\Users\rohan.garg\aem-sdk\dispatcher\aem-sdk-dispatcher-tools-2.0.91-windows\bin>docker_run D:\Project\Branches\Dispatcher_New\aemrepo\dispatcher\src host.docker.internal:4507 80
values.csv not found in deployment folder D:\Project\Branches\Dispatcher_New\aemrepo\dispatcher\src - using files in 'conf.d' and 'conf.dispatcher.d' subfolders directly
processing configuration subfolder 'conf.d'
processing configuration subfolder 'conf.dispatcher.d'
docker: Error response from daemon: mkdir D:\Project\Branches\Dispatcher_New\aemrepo\dispatcher\src\conf.dispatcher.d\enabled_farms\default.farm: Cannot create a file when that file already exists.
See 'docker run --help'.

Level 3

Hi @Rohan_Garg ,

Just after creating the symlink, check that the link between the "available" and "enabled" folders has actually been established. To create the symlink, use the "ln -s" command.

You can delete the file from the enabled folder; once the symlink exists, the configuration is read from available_farms.

If you are still facing the issue, please share a screenshot of the folder structure of conf.dispatcher.d.




Community Advisor

Hi All,

 

Previously the error documents were configured in the format below:

ErrorDocument 404 /errors/404.html
ErrorDocument 500 /errors/500.html
ErrorDocument 502 /errors/500.html

 

Changing them to the format below ensured that the requested URL was retained while the page content was served from these paths:

ErrorDocument 404 /errors/404/
ErrorDocument 500 /errors/500/
ErrorDocument 502 /errors/502/

Update - After further refactoring of the redirect rules, specifying the full path relative to DOCROOT also worked.

 

ErrorDocument 404 /content/brandA/us/en/errors/404/
ErrorDocument 500 /content/brandA/us/en/errors/500/
ErrorDocument 502 /content/brandA/us/en/errors/502/

So although the exact cause is unknown, tweaking the redirect rules solved the issue!
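
For context on how Apache resolves ErrorDocument targets (general behaviour, not a diagnosis of this case): a local URL-path beginning with / is served through an internal redirect, while a full URL makes Apache send an external redirect. The commented-out line uses the thread's placeholder host purely as an illustration.

```apache
# Local URL-path: served via an internal redirect, so the browser keeps the
# URL it requested and still receives the original 404 status.
ErrorDocument 404 /content/brandA/us/en/errors/404/

# Full URL: Apache answers with an external redirect instead, so the address
# bar changes and the client no longer sees the 404 status.
# ErrorDocument 404 https://brandA.com/errors/404.html
```

Both the shortened and the full-path forms posted above are local URL-paths, which is consistent with the requested URL being retained in either case.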