SOLVED

/etc/map.publish, URL rewriting, one dispatcher, and multiple domains pointing to the same web site: is it possible?

Hi,

Let's assume that we have a website stored under /content/acme/site-1.

We want this site to be accessible via preview-site-1.acmesites.com as well as site-1.com.

Our deployment has only one dispatcher in front of publisher instances.

Right now this is causing some issues for us: because the content for both preview-site-1.acmesites.com and site-1.com is stored under the same folder, e.g., /htdocs/acme/site-1, the two domains overwrite each other's content.

So when there's a link such as <a href="/content/acme/site-1/contact-us">Contact Us</a> on the home page, for example, then depending on the order of the entries in /etc/map.publish it might be transformed into

  • <a href="site-1.com/site-1/contact-us">Contact Us</a>, or 
  • <a href="preview-site-1.acmesites.com/contact-us">Contact Us</a>.

And the same link will be used on both web sites.

So the user might go to site-1.com, click on the link, and unexpectedly be transferred to preview-site-1.acmesites.com/contact-us.

I have a couple of questions:

  • Is /etc/map-based URL rewriting powerful enough to handle the scenario above with only one dispatcher instance?
  • If so, what am I missing? How should I configure AEM for the scenario above?

Plus, on a related note: is it possible to configure /etc/map so that both site-1.com (without www) and www.site-1.com are handled by the URL rewriting system, without duplicating sling:Mapping entries for the naked and www domains?

Thanks in advance.

1 Accepted Solution

This has been a long-standing problem with AEM and its ability to handle what really should be standard requirements for URL rewriting and URL shortening.

I think they dropped the ball on this one, in what is overall a very powerful CMS.

What you describe is correct: in AEM, the /etc/map entries are matched by content path, and the first entry that matches wins, regardless of any other entries for that path. It would be ideal for AEM to use the Sling method of matching against the incoming request header, but it uses the content path instead. This is fine in most circumstances, where a brand-new website has standardized URL naming, but troublesome when legacy URLs accumulated over the years need to map to the content.
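To make the first-match behaviour concrete, here is a toy Python model of it. This is only a sketch of the behaviour as described above, not Sling's or AEM's actual implementation; the `MAPPINGS` list and the `resolve` and `map_link` functions are hypothetical names made up for illustration.

```python
from typing import Optional

# Hypothetical mapping entries, in the order they would appear under
# /etc/map.publish. Each pair is (external host, internal content path).
MAPPINGS = [
    ("site-1.com", "/content/acme/site-1"),
    ("preview-site-1.acmesites.com", "/content/acme/site-1"),
]


def resolve(host: str, url_path: str) -> Optional[str]:
    """Incoming request: the host is known, so either domain resolves."""
    for mapped_host, content_root in MAPPINGS:
        if host == mapped_host:
            return content_root + url_path
    return None


def map_link(content_path: str) -> Optional[str]:
    """Outgoing link: only the content path is consulted, so the first
    entry whose content root matches wins, whatever host was requested."""
    for mapped_host, content_root in MAPPINGS:
        if content_path.startswith(content_root):
            return mapped_host + content_path[len(content_root):]
    return None


# Both hosts resolve to the same content path on the way in...
print(resolve("site-1.com", "/contact-us"))
print(resolve("preview-site-1.acmesites.com", "/contact-us"))
# ...but on the way out only the first entry is ever used.
print(map_link("/content/acme/site-1/contact-us"))
```

In this model, swapping the two entries in `MAPPINGS` flips every outgoing link to the preview domain, which matches the order-dependent behaviour described in the question.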

To work around your issue, you may want to consider:

1) Different virtual hosts with different cache folders. This way preview-site-1.acmesites.com and www.site-1.com will have two separate cache folders, even though internally they may map to the same content path. This gets tricky to manage as the number of paths grows, but it can still be done.

2) A reverse proxy in front of the dispatcher that can do fast URL inspection and transform the URL into what you want before sending it on to the dispatcher.
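The idea behind option 1 can be sketched in a few lines of Python. This is a toy illustration of keying the cache location on the requested host, not dispatcher configuration; the docroot paths and the `cache_file` helper are assumptions made up for the example.

```python
from pathlib import PurePosixPath

# Hypothetical cache folders, one per virtual host (paths are assumptions).
DOCROOTS = {
    "www.site-1.com": "/htdocs/site-1.com",
    "preview-site-1.acmesites.com": "/htdocs/preview-site-1.acmesites.com",
}


def cache_file(host: str, url_path: str) -> str:
    """Return where a cached page would be stored for this host and URL."""
    docroot = PurePosixPath(DOCROOTS[host])
    return str(docroot / url_path.lstrip("/"))


# The same page requested via two hosts lands in two different files,
# so one domain's cached copy no longer overwrites the other's.
print(cache_file("www.site-1.com", "/contact-us.html"))
print(cache_file("preview-site-1.acmesites.com", "/contact-us.html"))
```

Each host then keeps its own copy of the rendered page, at the cost of one more folder to manage and invalidate per domain.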

You will still have a problem with URL shortening, i.e. links coming out of AEM. They can only be shortened based on one entry per path, and to be honest, this is a business decision as much as a solution question. Ultimately, one website will be presented to users, and you will need to decide what the links look like on that page. Even if you rewrote URLs on the way out after the dispatcher (not recommended), which domain would you rewrite to?

Hope this gives some food for thought for further solutioning. 
