Set up two Dispatchers in a series arrangement

PreetpalSinghBi

15-07-2020

Hello guys, I need your suggestions on an Apache/Dispatcher setup. (Please see the attached diagram.)

[image.png: attached diagram showing web layers #1 and #2 with the green and red request flows]

Scenario
The customer already has the standard AEM Publish and Apache web servers (shown as #1 in the diagram) set up, and multiple websites are hosted there.
For one new site, there is a need to access the site through a new set of web servers (shown as #2 in the diagram) hosted on a third-party cloud.
If you look closely, this differs from the regular scenario: there are two web server layers in the request path. Usually there is just one web layer load-balancing the AEM publish instances.
I have marked two request flows here - green and red.

 

Green arrow flow:
This is the flow I have working: Web browser >>> Apache (#2) >>> proxies all incoming requests to Apache (#1) >>> Dispatcher (#1) >>> AEM >>> cache >>> Apache >>> back to the client.
The drawback here is that the files are cached on the inner-layer Apache server (#1), so they travel over the network to the client on every request. You will notice that Dispatcher (#2) is skipped because of the redirects at Apache (#2).

 

Red arrow flow:
This is the flow I think is better: the cache is in the outer layer (#2), so once a file is cached, subsequent requests are quicker and more efficient.
What stops me from achieving this? A request that has already been processed by Dispatcher (#2) is rejected by Dispatcher (#1), by the Dispatcher's design.

 

I do not believe I am the first person to implement this. If you have any suggestions or, even better, have come across this scenario yourself, please provide some pointers and I will try them out. Thank you.

 

Regards,

Preetpal

Apache Dispatcher

Accepted Solutions (1)

Shashi_Mulugu

MVP

17-07-2020

There you go... you can clearly see that Dispatcher #2 adds a few headers to show the publishers that it is a legitimate request:

Adding request header: host
Adding request header: Via
Adding request header: X-Forwarded-For
Adding request header: Server-Agent

 

This results in something similar to:

request.headers[Via] = "1.1 x.x.x.x (dispatcher)"

request.headers[X-Forwarded-For] = "122.225.226.181"

request.headers[Server-Agent] = "Communique-Dispatcher"

 

In Apache Server #1, the PT flag only prevents further redirects from being processed; it does not unset the request headers.

 

Please unset the above request headers in the virtual host entry of Apache Server #1:

 

RequestHeader unset via
RequestHeader unset server-agent

Example: https://stackoverflow.com/questions/4428903/remove-basic-authentication-header-with-apache-mod-proxy

 

To my knowledge, unsetting those two headers should solve the problem, but enable debug logging in Apache #1 to check whether you have to block or unset any other headers.
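
For reference, here is a minimal sketch of how the unset directives might sit in the Apache #1 virtual host. It assumes mod_headers is loaded. The port, the ServerName, and the surrounding layout are placeholders for illustration, not the poster's actual configuration.

```apache
# Apache #1 (inner layer) vhost: hypothetical sketch, adjust names and ports.
# RequestHeader is provided by mod_headers, which must be loaded.
<VirtualHost *:80>
    # Placeholder host name (the masked name that appears in the logs)
    ServerName subdomain.domain.com

    <IfModule mod_headers.c>
        # Drop the markers added by Dispatcher #2 so that Dispatcher #1
        # does not treat the request as already forwarded.
        RequestHeader unset Via
        RequestHeader unset Server-Agent
    </IfModule>

    # ... existing dispatcher handler configuration for this site ...
</VirtualHost>
```

Header names are case-insensitive, so `via` and `Via` behave the same here.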

 

 

Answers (7)

PreetpalSinghBi

21-07-2020

@Jörg_Hoh This did not work for me. I checked it twice.

I am using Apache 2.4.29 on Ubuntu 18.x for server #2 and Apache 2.4.28 on RHEL 7.5 for server #1.

DispatcherUseServerNameInVia 0

I could still see exceptions similar to the earlier ones, declining the call because the request had already been served by a Dispatcher. I could not find any Adobe documentation on this.

 

@Shashi_Mulugu This worked perfectly!

RequestHeader unset via
RequestHeader unset server-agent

I would still like to know whether Adobe provides a standard flag for this; otherwise, this solves what I wanted. Thank you so much.

Jörg_Hoh

Employee

20-07-2020

There is an internal "loop detection" built into the dispatcher: the dispatcher adds a special header when it forwards a request to AEM. If the dispatcher receives a request that already carries this header, it complains and does not forward the request, because it assumes a loop.

You can disable that feature if you add the Apache directive

DispatcherUseServerNameInVia 0

to the httpd configuration of web layer #2 (third party).

PreetpalSinghBi

20-07-2020

I will try this.

PreetpalSinghBi

17-07-2020

shashi1223


The outermost Apache (#2) has rewrite rules with the [PT] flag that pass the request on to the outermost Dispatcher (#2), which then passes it to the innermost Apache (#1) and from there to Dispatcher (#1) -> Publish.


The rewrite rules look like this:
RewriteRule ^/$ /content/aem-repository-path/home.html [PT]
RewriteRule ^/index(\.html)?$ /content/aem-repository-path/home.html [PT]

....

 

The innermost Apache (#1) vhost file just registers the port number for the site and the dispatcher module. It has no rewrite rules, as it is supposed to be a pass-through.

 

 

dispatcher log for the innermost server (#1)
=====================================
Found farm website for subdomain.domain.com
checking [/]
request URL has no extension: /
cache-action for [/]: NONE
Filter rejects: GET / HTTP/1.1
"GET /" - - 0ms [website/-]
Found farm website for subdomain.domain.com
checking [/]
request URL has no extension: /
cache-action for [/]: NONE
Filter rejects: GET / HTTP/1.1
"GET /" - - 0ms [website/-]
Already forwarded by dispatcher (subdomain.domain.com), declined.
"HEAD /aem-repository-path/home.html" - - 0ms [-/-]


dispatcher log for the outermost server (#2)
=====================================
Reusing socket: 49.xx.yy.123:80
Connected to backend rend01 (49.xx.yy.123:80)
Adding request header: host
Adding request header: Via
Adding request header: X-Forwarded-For
Adding request header: Server-Agent
response.status = 302
response.headers[Date] = "Fri, 17 Jul 2020 23:52:22 GMT"
response.headers[Content-Type] = "text/html; charset=iso-8859-1"
response.headers[X-Frame-Options] = "SAMEORIGIN, SAMEORIGIN"
response.headers[Location] = "https://domain/404.html"
Storing socket for later reuse: 49.xx.yy.123:80
"HEAD /aem-repository-path/home.html" 302 - 176ms [website/rend01]
Found farm website for subdomain.domain.com
checking [/aem-repository-path/home.html]
cachefile does not exist: /app/aem/aem-repository-path/home.html
cache-action for [/aem-repository-path/home.html]: NONE
Reusing socket: 49.xx.yy.123:80
Connected to backend rend01 (49.xx.yy.123:80)
Adding request header: host
Adding request header: Via
Adding request header: X-Forwarded-For
Adding request header: Server-Agent
response.status = 302
response.headers[Date] = "Fri, 17 Jul 2020 23:52:22 GMT"
response.headers[Content-Type] = "text/html; charset=iso-8859-1"
response.headers[X-Frame-Options] = "SAMEORIGIN, SAMEORIGIN"
response.headers[Location] = "https://domain/404.html"
Storing socket for later reuse: 49.xx.yy.123:80
"HEAD /aem-repository-path/home.html" 302 - 190ms [website/rend01]
response.status = 302
response.headers[Date] = "Fri, 17 Jul 2020 23:52:22 GMT"
response.headers[Content-Type] = "text/html; charset=iso-8859-1"
response.headers[X-Frame-Options] = "SAMEORIGIN, SAMEORIGIN"
response.headers[Location] = "https://domain/404.html"
Storing socket for later reuse: 49.xx.yy.123:80
"HEAD /aem-repository-path/home.html" 302 - 174ms [website/rend01]
Found farm website for subdomain.domain.com
checking [/aem-repository-path/home.html]
cachefile does not exist: /app/aem/aem-repository-path/home.html
cache-action for [/aem-repository-path/home.html]: NONE
Reusing socket: 49.xx.yy.123:80

Shashi_Mulugu

MVP

16-07-2020

@PreetpalSinghBi have you tried setting the IP and port of Apache #1 in the /renders section of dispatcher.any on Dispatcher #2, instead of using redirects at Apache #2?

 

Clients-->Apache#2-->Dispatcher#2-->Apache#1-->Dispatcher#1-->Publisher#1

 

A CDN in front of a dispatcher works more or less this way: the client hits the CDN, the CDN hits Apache, and Apache executes the dispatcher, which in turn renders from the publishers.

 

Treat your new web layer as a CDN and implement it accordingly.
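
As a rough illustration of that suggestion, the /renders section of the farm on Dispatcher #2 could point at Apache #1 rather than at a publish instance. The render name rend01 and the masked address are taken from the outermost dispatcher log posted in this thread. The rest is an assumed layout, not a verified configuration.

```
/renders
  {
  /rend01
    {
    # Address of Apache #1 (inner web layer), not an AEM publish instance
    /hostname "49.xx.yy.123"
    /port "80"
    }
  }
```

With this in place, Dispatcher #2 caches the responses it receives from Apache #1, which is the red arrow flow described in the question.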