Removing .html extension at dispatcher, sling:mapping, and vanity URLs conflict

ManOnTheMoon

26-05-2017

Hi, 

I am working on a website where we must remove .html from the URL, allow vanity URLs, and map content to /content/<application>. In front of the dispatcher, mod_rewrite removes .html from URLs with the following rules (e.g. /us/en/home.html becomes /us/en/home/):

# Handle request with no slash and no extension
RewriteCond %{REQUEST_URI} !^/content/dam/.*
RewriteCond %{REQUEST_URI} !.*\..*$
RewriteCond %{REQUEST_URI} !.*/$
# L stops rule processing once the redirect is issued
RewriteRule (.*)$ $1/ [R,L,QSA]

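# Handle requests to pages ending with .html
RewriteCond    %{REQUEST_URI} !^/content/dam/.*
# Escape the dot so only a literal ".html" suffix matches
RewriteCond    %{REQUEST_URI} \.html$
RewriteCond    %{REQUEST_URI} !^/health/.*
# L stops rule processing once the redirect is issued
RewriteRule    (.*)\.html$ $1 [R,L,QSA]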

# Handle requests to pages ending with a trailing slash
RewriteCond    %{REQUEST_URI} !^/content/dam
RewriteCond    %{REQUEST_URI} .*/$
RewriteCond    %{REQUEST_URI} !^/$
RewriteRule    ^([^.]+?)/?$ $1.html [PT,L,QSA]

This causes a problem for vanity URLs, because the last rule above rewrites them to .html (our vanity URLs do not have .html). Also, our sling:mapping (/etc/map) rewrites to /content/

Vanity path precedence does not work because of my rewrite at the dispatcher. Authors will be adding vanity URLs, so we need to handle this dynamically and do not want to hardcode rules for them in /etc/map and Apache mod_rewrite. Does anyone know how to solve this with AEM, or with a custom solution that accounts for these rules? Any ideas?

Accepted Solutions (1)


ManOnTheMoon

30-05-2017

The solution: check whether the request ends with an extension that exists on the site before rewriting, as below.

RewriteCond %{REQUEST_URI} !\.(html|jpg|jpeg|gif|png|tif|pdf|doc|docx|xls|xlsx|ppt|pptx|swf|css|js|svg)$
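The reply does not say which of the three rules this condition belongs to. One plausible reading, and it is only an assumption, is that it replaces the generic "no dot anywhere" check in the first rule, so that only requests ending in a known extension skip the trailing-slash redirect. A minimal sketch under that assumption:

```apache
# Assumption: the extension check replaces "!.*\..*$" in the first rule.
# Skip DAM assets.
RewriteCond %{REQUEST_URI} !^/content/dam/.*
# Skip requests that end in an extension actually used on the site.
RewriteCond %{REQUEST_URI} !\.(html|jpg|jpeg|gif|png|tif|pdf|doc|docx|xls|xlsx|ppt|pptx|swf|css|js|svg)$
# Skip requests that already end in a slash.
RewriteCond %{REQUEST_URI} !.*/$
# Redirect everything else to its trailing-slash form.
RewriteRule (.*)$ $1/ [R,L,QSA]
```

With this placement, /us/en/home is redirected to /us/en/home/, while /us/en/home.html is left for the .html rule to handle. Whether this alone restores vanity URL precedence depends on the rest of the setup, which the reply does not describe.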

Answers (4)


ManOnTheMoon

30-05-2017

kautuksahni wrote...

Please have a look at these forum threads:

Link:- http://help-forums.adobe.com/content/adobeforums/en/experience-manager-forum/adobe-experience-manage...

// Dispatcher does not cache a page that has no extension. However, one user suggested a workaround: add the .html extension in the Apache rewrite module.

 

Link:- http://help-forums.adobe.com/content/adobeforums/en/experience-manager-forum/adobe-experience-manage...

// http://a***.comsuffix1/suffix2  is not cached on dispatcher, because it does not have an extension. If you manage to have a URL like http://a***.comsuffix1/suffix2.html the dispatcher is able to cache it.

~kautuk

 

I am already adding the .html extension in the Apache rewrite module, as stated in my first post. The problem is that vanity URLs do not work with this (they have no .html extension), and sling:mapping does not work with the vanity URLs either (sling:mapping adds /content/<application_ID> to each request in AEM; in the dispatcher, this part of the URL is removed for all URLs).

I am trying to find a solution for the following. At the moment, I can only ensure that URLs end like this: /us/en/home/. This is what I am looking for:

1. Vanity URLs pass through the dispatcher correctly (e.g. /productshome/). Right now the sling:mapping transforms this to /content/<application_id>/productshome.html, ignoring that it is a vanity URL, because mod_rewrite adds the .html. The vanity URL therefore does not take precedence.

2. URLs are shortened (/content/<application_id> removed).

3. All URLs look like this: /us/en/home/ (.html removed and a slash appended).

I know this can be done if I account for every vanity URL in my rewrite rules and sling mappings, but authors will be editing our site and adding vanity URLs, and adding a rule each time is not maintainable. There seems to be no way to integrate sling:mappings, vanity URLs, and the URL rewrite requirements above unless we sync with the authors on each publish, which is not an option. Does anyone have a suggestion?
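For reference, requirement 2 above (dropping the /content/<application_id> prefix from public URLs) can be sketched on the Apache side as a single redirect. This is an illustration, not the rule actually used on this site: "myapp" is a hypothetical stand-in for <application_id>, and the sketch assumes a server or virtual-host context where the pattern matches the full URL path.

```apache
# Hypothetical: "myapp" stands in for <application_id>.
# Redirect long repository paths to their shortened public form,
# e.g. /content/myapp/us/en/home.html -> /us/en/home.html.
# Intended to run before the extension and trailing-slash rules.
RewriteRule ^/content/myapp/(.*)$ /$1 [R=301,L]
```

The shortened URL is then picked up on the next request by the .html and trailing-slash rules from the first post. It does nothing for vanity URLs, which is the open problem here.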

kautuk_sahni

Community Manager

28-05-2017

Please have a look at these forum threads:

Link:- http://help-forums.adobe.com/content/adobeforums/en/experience-manager-forum/adobe-experience-manage...

// Dispatcher does not cache a page that has no extension. However, one user suggested a workaround: add the .html extension in the Apache rewrite module.

 

Link:- http://help-forums.adobe.com/content/adobeforums/en/experience-manager-forum/adobe-experience-manage...

// http://a***.comsuffix1/suffix2  is not cached on dispatcher, because it does not have an extension. If you manage to have a URL like http://a***.comsuffix1/suffix2.html the dispatcher is able to cache it.

~kautuk

antoniom5495929

28-05-2017

Hi, if you need to remove the .html extension, I suggest using the out-of-the-box functionality. Have you already tried using the LinkStateChecker configuration to strip the .html? Let me know.