Multi site/domain dispatcher docroot vs common docroot

Level 5

Hi,

We set up a multi-site, multi-domain configuration (using a single dispatcher/Apache) with multiple virtual hosts and farm entries as described in the docs, plus a separate farm entry for dispatcher flush: https://docs.adobe.com/docs/en/dispatcher/disp-domains.html

The docroot of each site, in both the httpd virtual host and the dispatcher farm, is /usr/lib/apache/httpd-2.4.3/htdocs/sitea and /usr/lib/apache/httpd-2.4.3/htdocs/siteb respectively.

The dispatcher starts with these configs:

farms[farm_sitea].cache.docroot = /usr/lib/apache/httpd-2.4.3/htdocs/sitea

farms[farm_siteb].cache.docroot = /usr/lib/apache/httpd-2.4.3/htdocs/siteb

farms[farm_flush].cache.docroot = /usr/lib/apache/httpd-2.4.3/htdocs

and we see these paths getting cached under the respective site docroots:

sitea - /sitea/content/sitea, /sitea/content/dam, /sitea/etc

siteb - /siteb/content/siteb, /siteb/content/dam, /siteb/etc

Given these observations, here are my questions. All the sites use MSM and the site branding is the same.

1) How can we prevent /content/dam and /etc from being cached repeatedly under each site docroot, so that we avoid using a large amount of disk space?

2) What are the pros and cons of using a common docroot at the parent folder level instead of site-specific folders?

3) What is the advantage of keeping site-specific docroots as in https://docs.adobe.com/docs/en/dispatcher/disp-domains.html?

We tried (2): a common docroot, with the httpd virtual hosts and the dispatcher farms all referring to the parent folder /usr/lib/apache/httpd-2.4.3/htdocs/,

and the dispatcher starts with these configs:

farms[farm_sitea].cache.docroot = /usr/lib/apache/httpd-2.4.3/htdocs

farms[farm_siteb].cache.docroot = /usr/lib/apache/httpd-2.4.3/htdocs

farms[farm_flush].cache.docroot = /usr/lib/apache/httpd-2.4.3/htdocs

With (2), /content/sitea and /content/siteb are cached, and /etc and /content/dam are not duplicated, because the dispatcher cache mirrors the content structure on publish.

Is there any problem with (2), using the same parent docroot for all the sites? We also plan to set a higher statfileslevel so that cache invalidation is limited to the respective site.
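
To make the statfileslevel idea concrete, here is a minimal Python sketch. It assumes a simplified model in which an invalidation touches one .stat file per folder along the invalidated path, from the docroot down to the configured level. The function name and the short /htdocs docroot are made up for illustration; this is not dispatcher code.

```python
from pathlib import PurePosixPath


def stat_files_touched(docroot: str, handle: str, statfileslevel: int) -> list[str]:
    """Return the .stat files touched when `handle` is invalidated (simplified model)."""
    root = PurePosixPath(docroot)
    # Folder components of the invalidated path; the last component (page/file) is dropped.
    folders = PurePosixPath(handle.lstrip("/")).parts[:-1]
    touched = [str(root / ".stat")]  # level 0 is the docroot itself
    current = root
    for level, folder in enumerate(folders, start=1):
        if level > statfileslevel:
            break
        current = current / folder
        touched.append(str(current / ".stat"))
    return touched


if __name__ == "__main__":
    for stat_file in stat_files_touched("/htdocs", "/content/sitea/en/home.html", 2):
        print(stat_file)
```

In this model, with a common docroot and a level of 2, activating a page under /content/sitea never touches /content/siteb/.stat, which is what would keep siteb's cached pages valid.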

6 Replies

Employee Advisor

Hi,

To 1) Having /etc for each site shouldn't be a huge overhead, should it? I would rather think that /content/dam has the larger files and is driving disk space consumption. Do you have shared DAM content that is used on both SiteA and SiteB? I would also think that invalidation is the harder topic to solve (although with a custom invalidation script you can implement all the logic easily).

To the other questions: I don't get what you mean by "common docroot as parent level folder" and "common docroot" (as mentioned in questions 2 and 3). Can you give an example?

In a multi-site setup you have to pay special attention to dispatcher cache invalidation, because in the simplest approach the invalidation doesn't work.

Jörg

Level 5

Hi Jörg,

AEM is multi-tenant for site content (SiteA and SiteB) and DAM (SiteA and SiteB). We have node separation as follows:

  • /content/SiteA
  • /content/dam/SiteA

 

  • /content/SiteB
  • /content/dam/SiteB

We have a dispatcher flush farm as in the AEM docs.

For (2) and (3), as mentioned in my post above, we have a common docroot, /usr/lib/apache/httpd-2.4.3/htdocs, for both SiteA and SiteB instead of separate /usr/lib/apache/httpd-2.4.3/htdocs/SiteA and /usr/lib/apache/httpd-2.4.3/htdocs/SiteB docroots,

and hope to control the invalidation via a suitably high statfileslevel.

a) Given the above, what is the advantage of having separate SiteA and SiteB docroots?

b) What is the issue if there are no separate SiteA and SiteB docroots and only a common docroot, /usr/lib/apache/httpd-2.4.3/htdocs?

Employee Advisor

Hi,

If you don't use shortened URLs (siteA.com/home.html) but "long" URLs (siteA.com/content/siteA/pages/home.html), then a single docroot is OK, because the dispatcher cache directory structure can directly mimic the structure of the Sling resource tree.

But when you have shortened your URLs, the topic gets more complicated; I once talked about it in an "Ask the Experts" session [1]. If you visualize the sites and the resources and try to map them to a filesystem structure, that structure is no longer identical to the Sling resource tree. And if it's not identical, the default dispatcher invalidation mechanics do not work anymore. Then you have to fix this by writing custom invalidation scripts, which know about your content structure and can do the invalidation based on it.

So to answer your questions:

a) It allows you to make each site cacheable on the dispatcher side (with each site having shortened URLs).

b) You either don't use shortened URLs, or http://siteA.com/home.html might return the same file as http://siteB.com/home.html.
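
The collision in (b) can be sketched in a few lines of Python, assuming the cache file is simply the docroot joined with the request path as it arrives at the dispatcher. The helper name is hypothetical; the paths and hosts are the ones used in this thread.

```python
import posixpath


def cache_file(docroot: str, url_path: str) -> str:
    """Simplified model: cache file = docroot + request path."""
    return posixpath.join(docroot, url_path.lstrip("/"))


COMMON = "/usr/lib/apache/httpd-2.4.3/htdocs"

# Common docroot with shortened URLs: both hosts map to the same file.
shared = {host: cache_file(COMMON, "/home.html") for host in ("siteA.com", "siteB.com")}

# Separate docroots: each host gets its own file for the same short URL.
separate = {
    "siteA.com": cache_file(COMMON + "/sitea", "/home.html"),
    "siteB.com": cache_file(COMMON + "/siteb", "/home.html"),
}

if __name__ == "__main__":
    print(shared)
    print(separate)
```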

HTH,

Jörg

 

[1] https://communities.adobeconnect.com/p8h87dumxgv/

Level 5

Hi Jörg,

Thanks for the details. Additionally, I followed this old post, https://forums.adobe.com/thread/1082213, which says that the dispatcher does not understand Sling-mapped URLs by default. That could apply only to old dispatcher versions, and recent versions might work well with Sling-mapped URLs for multiple domains. I am referring to these lines: "The /docroot property is set to the path of the root directory of the domain's site content in the Dispatcher cache. This path is used as the prefix for the concatenated URL from the original request. For example, the docroot of /usr/lib/apache/httpd-2.4.3/htdocs/sitea causes the request for http://branda.com/en.html to resolve to the /usr/lib/apache/httpd-2.4.3/htdocs/sitea/en.html file."

Trying to summarize the two points of view. Please share your thoughts on whether both approaches can work if the following is taken care of.

1) For a common docroot to work for multi-domain and multi-locale

docroot - /var/www

Within the respective vhost, have Apache rewrites like /$ - /content/siteA/$1,
which in turn request publish with the full path (/content/siteA/en-us/home.html) rather than the mapped path. The cache then follows the publish node structure, and content gets cached without any overwrites:

/content/siteA/en-us/home.html
/content/siteB/en-us/home.html

With a decent statfileslevel and no separate invalidation farm, the dispatcher could work, and Sling mapping on publish takes care of the short URL links.

2) For separate docroots to work for multi-domain

docroot - /var/www/content/siteA
docroot - /var/www/content/siteB

If we use separate document roots as in https://docs.adobe.com/docs/en/dispatcher/disp-domains.html, the dispatcher module knows to append the Sling-mapped URL to the docroot out of the box, without any Apache rewrites, and a separate dispatcher farm is needed for cache invalidation.

Overall, Apache mod_rewrite is the key to making a common docroot work for multi-domain with Sling mappings.
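
To illustrate what the per-vhost rewrite in approach 1 would have to do, here is a rough Python sketch of the mapping logic only (not an actual mod_rewrite rule). The host-to-root table and the pass-through pattern for /content and /etc are assumptions made up for this example.

```python
import re

# Hypothetical mapping from virtual host to its content root on publish.
SITE_ROOTS = {
    "siteA.com": "/content/siteA",
    "siteB.com": "/content/siteB",
}

# Assumption: requests that already use a full repository path are left alone.
PASS_THROUGH = re.compile(r"^/(content|etc)(/|$)")


def rewrite(host: str, path: str) -> str:
    """Expand a shortened URL path to the full content path for the given host."""
    if PASS_THROUGH.match(path):
        return path
    return SITE_ROOTS[host] + path


if __name__ == "__main__":
    print(rewrite("siteA.com", "/en-us/home.html"))
    print(rewrite("siteB.com", "/en-us/home.html"))
```

Because both hosts end up with distinct full paths, the files cached under the common docroot do not overwrite each other.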

Employee Advisor

I personally prefer to use option 1, as then the request is handled by AEM as sent by the user (http://siteA.com/home.html), while in option 2 the requested path is rewritten on the web server and then forwarded to AEM as http://siteA.com/content/siteA/home.html.

Both approaches work (colleagues of mine use option 2 quite often), but I prefer option 1 because you need to rewrite all links and references on AEM anyhow (and therefore a correct mapping is required), so you get the rewriting of the incoming request URL almost for free.

Level 5

Could you add which of these (common docroot vs. separate docroot) option 1 and option 2 refer to in your post? :)

And from a DAM cache invalidation POV:

1) In a common docroot, if we maintain the same depth as the content tree, then the same statfileslevel could handle DAM as well, right?

2) In a separate docroot, is additional consideration needed because the DAM gets a nested path within the docroot? Any thoughts on how DAM cache invalidation can be made effective? I can think of a custom implementation that sends a dispatcher invalidation request to /content/siteA/content/Dam/../.. upon DAM activation/deactivation.
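
A custom invalidation of that kind could start from something like the sketch below. It only builds the HTTP request and does not send it. It assumes the standard /dispatcher/invalidate.cache endpoint with the CQ-Action and CQ-Handle headers; the dispatcher host and the handle are placeholders, and which handle is correct for a nested DAM path depends on the docroot layout.

```python
from urllib.request import Request


def build_invalidation_request(dispatcher_base: str, handle: str,
                               action: str = "Activate") -> Request:
    """Build (but do not send) a dispatcher invalidation request for `handle`."""
    return Request(
        url=dispatcher_base.rstrip("/") + "/dispatcher/invalidate.cache",
        method="POST",
        headers={
            "CQ-Action": action,   # "Activate" or "Deactivate"
            "CQ-Handle": handle,   # path to invalidate, as seen by the target farm
            "Content-Length": "0",
        },
    )


if __name__ == "__main__":
    # Placeholder host and handle, for illustration only.
    req = build_invalidation_request("http://dispatcher.example", "/content/dam/SiteA")
    print(req.get_method(), req.full_url)
    print(req.header_items())
```

Sending it would be a matter of passing the request to urllib.request.urlopen from a host that the farm's flush configuration accepts.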