AEMaaCS Robots.txt points to the default Adobe one, not the custom one
Level 2
June 18, 2025
Question

  • 4 replies
  • 1076 views

Hello,

 

We are implementing a custom robots.txt that will be maintainable by content authors. The file is located at "/content/dam/mycompany/config/robots.txt".

 

We implemented the following Dispatcher changes and deployed them to the AEMaaCS environment:

 

1. filters.any:

# Allow robots.txt
/0104 { /type "allow" /url "/content/dam/mycompany/config/robots.txt" /extension "txt" }
/0105 { /type "allow" /url "/robots.txt" /extension "txt" }

 

2. To serve the file inline rather than as a download, in dispatcher/src/conf.d/available_vhosts:

<LocationMatch "^/content/.*/robots\.txt$">
    Header unset "Content-Disposition"
    Header set Content-Disposition inline
</LocationMatch>

 

3. rewrite.rules:

RewriteCond %{REQUEST_URI} ^/robots\.txt$
RewriteRule .* /content/dam/mycompany/config/robots.txt [PT,L]
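
For reference, here is a minimal consolidated sketch of how the three changes above are meant to interact (illustrative only, assuming standard AEMaaCS dispatcher processing; it is not an additional change):

# The vhost rewrite runs before dispatcher filtering. Because of [PT], the filter
# is evaluated against the rewritten path /content/dam/mycompany/config/robots.txt
# (allowed by /0104), while direct requests to /robots.txt are covered by /0105.
RewriteEngine on
RewriteCond %{REQUEST_URI} ^/robots\.txt$
RewriteRule .* /content/dam/mycompany/config/robots.txt [PT,L]

# Serve the DAM file inline instead of as a download (requires mod_headers).
<LocationMatch "^/content/.*/robots\.txt$">
    Header unset "Content-Disposition"
    Header set Content-Disposition inline
</LocationMatch>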

 

Still, the default Adobe robots.txt file is served, not our custom one.

What seems to be missing?

 

Appreciate your help!

Regards,

Shruti

 

4 replies

Saravanan_Dharmaraj
Community Advisor
June 18, 2025

@shrutid56001193 When you hit the dispatcher URL http://<disp_domain>/robots.txt directly, does it pick up your file? Check the dispatcher logs too. If the hit is processed correctly by the dispatcher, the issue might be with the CDN; check whether there is a redirect there.
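
As a quick way to tell which layer answered, you could also temporarily tag the responses with a marker header (a minimal sketch; the header name X-Robots-Source is made up for illustration and assumes mod_headers is loaded in the vhost):

# Tag responses for the robots paths so the response headers show whether this
# vhost rule actually ran at the dispatcher, or the CDN answered from elsewhere.
# Remove after debugging.
<LocationMatch "^/(robots\.txt|content/dam/mycompany/config/robots\.txt)$">
    Header set X-Robots-Source "dispatcher-vhost"
</LocationMatch>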

Level 2
June 19, 2025

Thanks, yes, this seems to be cached at the CDN level. This is AEM as a Cloud Service and we are using the Adobe-bundled Fastly CDN, so when I try the publish + dispatcher URL, it always goes through the CDN. How can I bypass the CDN?

konstantyn_diachenko
Community Advisor
June 19, 2025

Hi @shrutid56001193 ,

 

For testing purposes, try changing your rewrite rule so it also matches requests with query parameters:

RewriteCond %{REQUEST_URI} ^/robots\.txt
RewriteRule .* /content/dam/mycompany/config/robots.txt [PT,L]

 

To bypass the CDN cache, try calling /robots.txt?nocache=1 (a query string the CDN has not cached before should produce a cache miss, so the request reaches the dispatcher).
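
Along the same lines, you could temporarily mark the file as non-cacheable so changes show up immediately (a test-only sketch of mine, not part of the suggestion above; requires mod_headers, and assumes the managed CDN honors origin Cache-Control headers):

# Debugging aid: ask the CDN not to cache robots.txt at all while testing.
# Remove once the custom file is served correctly.
<LocationMatch "^/robots\.txt$">
    Header set Cache-Control "no-store, max-age=0"
</LocationMatch>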

Kostiantyn Diachenko, Community Advisor, Certified Senior AEM Developer, creator of free AEM VLT Tool, maintainer of AEM Tools plugin.
konstantyn_diachenko
Community Advisor
June 18, 2025

Hi @shrutid56001193 ,

 

Try replacing your dispatcher filter with the following one:

# Allow robots.txt
/0105 { /type "allow" /method "GET" /path "/content/dam/mycompany/config/robots" /extension "txt" }

 

Kostiantyn Diachenko, Community Advisor, Certified Senior AEM Developer, creator of free AEM VLT Tool, maintainer of AEM Tools plugin.
Level 2
June 19, 2025

Thanks @konstantyn_diachenko, but it didn't work...

Shashi_Mulugu
Community Advisor
June 19, 2025

@shrutid56001193 Please try @konstantyn_diachenko's solution. Also try clearing the dispatcher cache after any rules change; you might be serving an already cached file.

 

If it doesn't work, try hitting the URL on the publisher directly, and check whether the dispatcher is calling the publisher, and if so, with which path, etc.
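
On the cached-file point, one way to rule out a stale dispatcher cache entry is to exclude the file from the cache entirely (a sketch I am adding for illustration, using the farm's standard /cache /rules section):

# In the farm configuration: never cache the robots files, so a stale copy in the
# cache directory cannot be served after filter or rewrite changes.
/cache {
  /rules {
    /0100 { /type "deny" /glob "/robots.txt" }
    /0101 { /type "deny" /glob "/content/dam/mycompany/config/robots.txt" }
  }
}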

Level 2
June 19, 2025

It's AEM as a Cloud Service, so it's not possible to hit the publisher directly.

October 31, 2025

Hi @shrutid56001193, did you find a solution?

Level 2
October 31, 2025

There is no clear solution, but there are two workarounds:

 

Workaround 1:

Only for dev and stage domains: we can use a custom path for robots.txt, e.g. https://publish-pxxxx-exxxx.adobeaemcloud.com/my-path/robots.txt, which serves the custom robots file (see the sketch at the end of this post).

Workaround 2:

Configure custom domain names for the dev and stage environments; then /robots.txt will point to the custom robots file.

 

We went ahead with Workaround 1, as this is only for dev and stage, not prod.
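
For anyone following along, a minimal sketch of what Workaround 1 looks like in the vhost rewrite rules (assuming the same DAM location as above; /my-path is a placeholder):

# Map the custom public path to the authorable DAM file. Unlike /robots.txt, the
# custom path does not collide with the default robots.txt served on the
# *.adobeaemcloud.com domains.
RewriteCond %{REQUEST_URI} ^/my-path/robots\.txt$
RewriteRule .* /content/dam/mycompany/config/robots.txt [PT,L]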