

Robots.txt File is cached from AEM dispatcher


Level 4

In AEM, we have 5 domains and we're configuring a separate robots.txt for each of them, but for some reason we're getting the same robots.txt for all domains. We've double-checked the rewrite rules and they work correctly, so robots.txt appears to be cached somewhere. To my knowledge we aren't caching robots.txt on the dispatcher, and we're also getting Cache-Control: no-store, private in the response headers.
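For context, a quick way to compare what each domain actually serves and which cache headers come back (the hostnames below are placeholders, not our real domains):

curl -sS -D - https://www.domain1.example/robots.txt | head -n 20
curl -sS -D - https://www.domain2.example/robots.txt | head -n 20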

3 Replies


Employee Advisor

Hi @kbitra1998 ,
Do you have an entry in the Sling Resource Resolver mapping for resolving robots.txt? If so, it will simply resolve the first one irrespective of having multiple domains. Here is how you should implement robots.txt for multiple domains -

For 5 domains you will have 5 virtual host configurations. Make an entry like this in the rewrite rules of each virtual host (a complete virtual host sketch follows the rules below).

# in the virtual host for domain 1
RewriteRule ^/robots\.txt$ /content/dam/domain1/robots.txt [NC,PT]

# in the virtual host for domain 2
RewriteRule ^/robots\.txt$ /content/dam/domain2/robots.txt [NC,PT]

# in the virtual host for domain 3
RewriteRule ^/robots\.txt$ /content/dam/domain3/robots.txt [NC,PT]

# in the virtual host for domain 4
RewriteRule ^/robots\.txt$ /content/dam/domain4/robots.txt [NC,PT]

# in the virtual host for domain 5
RewriteRule ^/robots\.txt$ /content/dam/domain5/robots.txt [NC,PT]
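For clarity, here is a trimmed virtual host sketch showing where the rule sits (ServerName, DocumentRoot and the DAM path are placeholders; adjust them to your setup):

<VirtualHost *:80>
    # placeholder hostname for domain 1
    ServerName www.domain1.example
    # dispatcher docroot
    DocumentRoot /mnt/var/www/html
    RewriteEngine On
    # serve the domain-specific robots.txt stored in the DAM
    RewriteRule ^/robots\.txt$ /content/dam/domain1/robots.txt [NC,PT]
</VirtualHost>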

I am using the PT flag here; you can keep it as [R] initially to check that the correct redirection happens in the browser, and then replace it with [PT].
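For example, a temporary test variant for domain 1 (same placeholder path as above) could be:

# temporary: issue a visible 302 redirect so the target can be verified in the browser
RewriteRule ^/robots\.txt$ /content/dam/domain1/robots.txt [NC,R=302]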

 

You can also have the robots.txt files cached on the dispatcher, since they don't change very frequently. Irrespective of whether you cache them or not, the rewrite rule is applied first, so the cached location is different for each of these robots.txt files. The cached location would be something like this -

/mnt/var/www/html/content/dam/domain1/robots.txt
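If you do want it cached, a rough sketch of a dispatcher farm /cache rule that allows it (the rule number and docroot depend on your existing dispatcher.any):

/cache
  {
  /docroot "/mnt/var/www/html"
  /rules
    {
    # ... existing rules ...
    # allow the per-domain robots.txt files to be cached
    /0100 { /glob "/content/dam/*/robots.txt" /type "allow" }
    }
  }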

 


Level 4

Apologies for the late response.

 

We used identical settings, but it still seems to be cached somewhere; we couldn't find the file in the dispatcher cache folder at /mnt/var/www/html/content/dam/domain1/robots.txt either. We now suspect it is cached at the Akamai level rather than the AEM level. We're working with the Akamai team, and I'll post an update once we have a solution.
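For anyone debugging something similar, a sketch of how to check whether Akamai is serving the file from its cache, using Akamai's debug Pragma headers (they are only returned if the property allows them; the hostname is a placeholder):

curl -sS -D - -o /dev/null \
  -H "Pragma: akamai-x-cache-on, akamai-x-check-cacheable, akamai-x-get-cache-key" \
  https://www.domain1.example/robots.txt
# look for X-Cache (TCP_HIT / TCP_MISS) and X-Check-Cacheable in the response headers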

Thank you for your assistance.

 


Level 4

Robots.txt was indeed cached at the Akamai level. Akamai has updated the caching rules, and now it is working as expected.
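If anyone hits the same issue, note that a stale copy may still need to be invalidated after the rule change; with the Akamai CLI purge plugin that would look roughly like this (placeholder URL):

akamai purge invalidate "https://www.domain1.example/robots.txt"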

 

@Anish-Sinha Thanks for the input.