We tried setting up a multi-site, multi-domain configuration (using a single dispatcher/Apache) with multiple virtual hosts and farm entries as described in the docs, plus a separate farm entry for dispatcher flush: https://docs.adobe.com/docs/en/dispatcher/disp-domains.html
The docroot of each site, in both the httpd virtual host and the dispatcher farm, is /usr/lib/apache/httpd-2.4.3/htdocs/sitea and /usr/lib/apache/httpd-2.4.3/htdocs/siteb respectively.
The dispatcher starts with these configs:
farms[farm_sitea].cache.docroot = /usr/lib/apache/httpd-2.4.3/htdocs/sitea
farms[farm_siteb].cache.docroot = /usr/lib/apache/httpd-2.4.3/htdocs/siteb
farms[farm_flush].cache.docroot = /usr/lib/apache/httpd-2.4.3/htdocs
and we see the following getting cached under each site's docroot:
sitea - /sitea/content/sitea, /sitea/content/dam, /sitea/etc
siteb - /siteb/content/siteb, /siteb/content/dam, /siteb/etc
Based on this observation, here are my questions. All the sites use MSM and the site branding is the same.
1) How can we prevent /content/dam and /etc from being cached repeatedly under each site's docroot, so that they don't use up a lot of disk space and I/O?
2) What are the pros and cons of using a common docroot at the parent folder level instead of site-specific folders?
3) What is the advantage of keeping site-specific docroots as in https://docs.adobe.com/docs/en/dispatcher/disp-domains.html
We tried (2): a common docroot, with the httpd virtual hosts and the dispatcher farms all referring to the parent folder /usr/lib/apache/httpd-2.4.3/htdocs/
and the dispatcher starts with these configs:
farms[farm_sitea].cache.docroot = /usr/lib/apache/httpd-2.4.3/htdocs
farms[farm_siteb].cache.docroot = /usr/lib/apache/httpd-2.4.3/htdocs
farms[farm_flush].cache.docroot = /usr/lib/apache/httpd-2.4.3/htdocs
With (2), /content/sitea and /content/siteb are cached, and /etc and /content/dam are not duplicated, because the dispatcher cache mirrors the content structure on publish.
Is there any problem with (2), i.e. using the same parent docroot for all the sites? We also plan to set a higher statfileslevel so that cache invalidation is limited to the respective site.
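To make the statfileslevel idea concrete, here is a minimal Python sketch of which .stat files an invalidation would be expected to touch under a common docroot. It assumes the behaviour described in the dispatcher documentation as I understand it: .stat files are touched from the docroot (level 0) down to the configured level or the level of the invalidated path, whichever is smaller. The function name and the example page path are hypothetical; this is an illustration, not dispatcher code.

```python
from pathlib import PurePosixPath


def touched_stat_files(docroot, invalidated_path, statfileslevel):
    """Return the .stat files an invalidation is expected to touch.

    Assumption: .stat files are touched from the docroot (level 0) down to
    min(statfileslevel, depth of the folder holding the invalidated path).
    """
    # Folders above the invalidated resource, e.g. ("content", "sitea", "en")
    folders = PurePosixPath(invalidated_path).parts[1:-1]
    depth = min(statfileslevel, len(folders))
    root = PurePosixPath(docroot)
    result = [str(root / ".stat")]
    for level in range(1, depth + 1):
        result.append(str(root.joinpath(*folders[:level]) / ".stat"))
    return result


if __name__ == "__main__":
    docroot = "/usr/lib/apache/httpd-2.4.3/htdocs"
    for stat in touched_stat_files(docroot, "/content/sitea/en/home", 2):
        print(stat)
```

Under these assumptions, with level 2 an activation below /content/sitea does not touch /content/siteb/.stat, so the cached siteb pages would stay valid. With level 0 only the docroot .stat exists, and every activation would invalidate both sites.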
To 1) Having /etc for each site shouldn't be a huge overhead, should it? I would rather think that /content/dam has the larger files and drives disk space consumption. Do you have shared DAM content that is used on both SiteA and SiteB? I would also think that the invalidation topic is harder to solve (though if you use a custom invalidation script, you can implement all the logic easily).
To the other questions: I don't get what you mean by "common docroot as parent level folder" and "common docroot" (as mentioned in questions 2 and 3). Can you give an example of it?
In the multi-site setup you have to pay special attention to the aspect of dispatcher cache invalidation, as in the simplest approach the invalidation doesn't work.
AEM is multi-tenant for site content (SiteA and SiteB) and DAM (SiteA and SiteB). We have node separation as here
We have a dispatcher flush farm as in the AEM docs.
For (2) and (3), as mentioned in my post above, we have a common docroot of /usr/lib/apache/httpd-2.4.3/htdocs for both SiteA and SiteB instead of separate /usr/lib/apache/httpd-2.4.3/htdocs/SiteA and /usr/lib/apache/httpd-2.4.3/htdocs/SiteB docroots,
and we hope to control the invalidation via a suitably high statfileslevel.
a) Given the above, what is the advantage of having separate SiteA and SiteB docroots?
b) What is the issue if there are no separate SiteA and SiteB docroots and only a common docroot /usr/lib/apache/httpd-2.4.3/htdocs?
If you don't use shortened URLs (siteA.com/home.html) but "long" URLs (siteA.com/content/siteA/pages/home.html), then a single docroot is OK, because the dispatcher cache directory structure can directly mimic the structure of the Sling resource tree.
But when you have shortened your URLs, the topic gets more complicated; I once talked about this in an "Ask the experts" session. If you visualize the sites and the resources and try to map them to a filesystem structure, it's no longer identical to the Sling resource tree. And if it's not identical, the default dispatcher invalidation mechanics don't work anymore. Then you have to fix this by writing custom invalidation scripts, which know about your content structure and can do the invalidation based on it.
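The difference between long and short URLs under one docroot can be sketched in a few lines of Python. This is only an illustration, assuming the cache file is simply the docroot followed by the request path; the helper names and the /var/www docroot are made up for the example.

```python
import posixpath


def cache_file(docroot, url_path):
    """Cache file for a request: the docroot is used as prefix of the URL path."""
    return posixpath.join(docroot, url_path.lstrip("/"))


def find_collisions(requests, docroot_for_host):
    """Return the cache files that requests from more than one host map to."""
    hosts_by_file = {}
    for host, url_path in requests:
        path = cache_file(docroot_for_host[host], url_path)
        hosts_by_file.setdefault(path, set()).add(host)
    return {f: sorted(h) for f, h in hosts_by_file.items() if len(h) > 1}


if __name__ == "__main__":
    common = {"siteA.com": "/var/www", "siteB.com": "/var/www"}
    short = [("siteA.com", "/home.html"), ("siteB.com", "/home.html")]
    print(find_collisions(short, common))
```

With short URLs and a common docroot, both sites end up at the same cache file, so one site's page could be served for the other. With long URLs, or with one docroot per site, the paths stay distinct.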
So to answer your questions:
a) It allows you to make each site cacheable on the dispatcher side, with each site having shortened URLs.
Thanks for the details. Additionally, I have also followed this old post, https://forums.adobe.com/thread/1082213, which says that the dispatcher does not understand Sling-mapped URLs by default. That may apply only to old dispatcher versions; recent versions might work well with Sling-mapped URLs for multiple domains, going by these lines from the docs: "The /docroot property is set to the path of the root directory of the domain's site content in the Dispatcher cache. This path is used as the prefix for the concatenated URL from the original request. For example, the docroot of /usr/lib/apache/httpd-2.4.3/htdocs/sitea causes the request for http://branda.com/en.html to resolve to the /usr/lib/apache/httpd-2.4.3/htdocs/sitea/en.html file."
Trying to summarize the two points of view. Please share your thoughts on whether both approaches are possible if these points are taken care of.
1) For a common docroot to work for multi-domain and multi-locale:
docroot - /var/www
Within the respective vhost, have Apache rewrites like /$ - /content/siteA/$1
which in turn request publish with the full path (/content/siteA/en-us/home.html) rather than the mapped path, and hence create cache structures that follow the publish node structure, so content gets cached without any overwrites.
Now, with a suitable statfileslevel and no separate invalidation farm, the dispatcher could work, and the Sling mapping on publish takes care of the short URL links.
2) For separate docroots to work for multi-domain:
docroot - /var/www/content/siteA
docroot - /var/www/content/siteB
If we use separate document roots as in https://docs.adobe.com/docs/en/dispatcher/disp-domains.html, the dispatcher module knows OOTB how to append the Sling-mapped URL to the docroot, without any Apache rewrites, and a separate dispatcher farm is needed for cache invalidation.
Overall, Apache mod_rewrite is the key to making a common docroot work for multi-domain with Sling mappings.
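The two variants summarized above can be compared with a small Python sketch. It is a rough model only: the rewrite rule (short page paths get the /content/<site> prefix, paths already under /content or /etc pass through) is an assumption about how the vhost rewrites would be set up, and all function names are hypothetical.

```python
import posixpath


def rewrite(site, path):
    """Assumed vhost rewrite: prefix short page paths with /content/<site>."""
    if path.startswith(("/content/", "/etc/")):
        return path  # assumed to pass through unchanged
    return "/content/" + site + path


def cache_path_common_docroot(site, path, docroot="/var/www"):
    """Variant 1: common docroot, request path rewritten to the full path."""
    return posixpath.join(docroot, rewrite(site, path).lstrip("/"))


def cache_path_separate_docroot(site, path, base="/var/www/content"):
    """Variant 2: per-site docroot, request path appended as sent."""
    return posixpath.join(base, site, path.lstrip("/"))


if __name__ == "__main__":
    for fn in (cache_path_common_docroot, cache_path_separate_docroot):
        print(fn("siteA", "/en-us/home.html"))
        print(fn("siteA", "/content/dam/logo.png"))
```

In this model a page lands in the same file either way (/var/www/content/siteA/en-us/home.html), but a DAM asset is cached once under the common docroot and once per site, in a nested path, under the separate docroots.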
I personally prefer to use option 1, as then the request is handled by AEM as sent by the user (http://siteA.com/home.html), while in option 2 the requested path is rewritten on the webserver and then forwarded to AEM as http://siteA.com/content/siteA/home.html.
Both approaches work (colleagues of mine use option 2 quite often), but I prefer option 1 because you need to rewrite all links and references on AEM anyhow (and therefore a correct mapping is required), so you get the rewriting of the incoming request URL almost for free.
Could you add which of these (common docroot vs separate docroot) you refer to as option 1 and option 2 in your post?
And from a DAM cache invalidation POV:
1) With a common docroot, if we maintain the same depth as the content tree, then the same statfileslevel could handle DAM as well, right?
2) With separate docroots, additional consideration is needed, as the DAM gets a nested path within each docroot. Any thoughts on how DAM cache invalidation can be made effective? I can think of a custom implementation that sends a dispatcher invalidation request to /content/siteA/content/Dam/../.. upon DAM activation/deactivation.
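The custom implementation mentioned in 2) could start from something like the following Python sketch. It only builds the invalidation requests and does not send anything. It assumes a flush farm whose docroot is the parent folder (so a handle of /content/siteA/content/dam/... points into siteA's cache) and the usual flush request shape, as far as I know it, of /dispatcher/invalidate.cache with CQ-Action and CQ-Handle headers. The function names, the site list and the asset path are hypothetical.

```python
def nested_dam_handles(activated_path, sites, site_prefix="/content"):
    """Map one activated DAM path to the nested cache path of every site."""
    return [f"{site_prefix}/{site}{activated_path}" for site in sites]


def invalidation_requests(activated_path, sites, action="Activate"):
    """Describe one flush request per site (assumed URI and header names)."""
    return [
        {
            "uri": "/dispatcher/invalidate.cache",
            "headers": {"CQ-Action": action, "CQ-Handle": handle},
        }
        for handle in nested_dam_handles(activated_path, sites)
    ]


if __name__ == "__main__":
    for req in invalidation_requests("/content/dam/shared/logo.png",
                                     ["siteA", "siteB"]):
        print(req["uri"], req["headers"])
```

A real implementation would also have to decide which sites actually reference the activated asset; the sketch simply fans out to every site passed in.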