Hello,
We are implementing a custom robots.txt that will be maintainable by content authors. The file is located at "/content/dam/mycompany/config/robots.txt".
We implemented the following Dispatcher changes and deployed them to the AEMaaCS environment:
1. filters.any
# Allow robots.txt
/0104 { /type "allow" /url "/content/dam/mycompany/config/robots.txt" /extension "txt" }
/0105 { /type "allow" /url "/robots.txt" /extension "txt" }
2. To serve the file inline rather than as a download, in dispatcher/src/conf.d/available_vhosts:
<LocationMatch "^/content/.*/robots\.txt$">
Header unset "Content-Disposition"
Header set Content-Disposition inline
</LocationMatch>
3. rewrite.rules:
RewriteCond %{REQUEST_URI} ^/robots\.txt$
RewriteRule .* /content/dam/mycompany/config/robots.txt [PT,L]
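Before digging further, it can help to confirm what each layer actually returns. A minimal sketch (the hostname below is a placeholder for your publish domain) that prints the two checks to run:

```shell
# Placeholder host; substitute your AEMaaCS publish domain.
HOST="publish-p12345-e67890.adobeaemcloud.com"

# Headers-only request for the vanity path, following any redirects:
VANITY_CHECK="curl -sIL https://${HOST}/robots.txt"

# The DAM asset directly, to confirm the dispatcher filter allows it:
DAM_CHECK="curl -sI https://${HOST}/content/dam/mycompany/config/robots.txt"

echo "$VANITY_CHECK"
echo "$DAM_CHECK"
```

If the DAM URL serves your file but the vanity path does not, the rewrite or the CDN is the suspect rather than the filter.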
Still, the Adobe default robots.txt is served, not our custom one.
What seems to be missing?
Appreciate your help!
Regards,
Shruti
@shrutid56001193 When you hit the dispatcher URL http://<disp_domain>/robots.txt directly, does it pick up your file? Check the dispatcher logs too. If the request is processed correctly at the dispatcher, the issue might be with the CDN; check whether there is a redirect there.
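One way to see whether the CDN is answering from cache is to inspect the response headers; Fastly typically includes X-Cache (HIT/MISS) and X-Served-By. A sketch (placeholder hostname) that prints the command to run:

```shell
HOST="publish-p12345-e67890.adobeaemcloud.com"  # placeholder domain
# Headers-only request, filtered for the usual Fastly cache headers.
CMD="curl -sI https://${HOST}/robots.txt | grep -iE 'x-cache|x-served-by|age'"
echo "$CMD"
```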
Thanks, yes, this looks like it's cached at the CDN level. This is AEM as a Cloud Service and we are using the Adobe-bundled Fastly CDN, so when I hit the publish+dispatcher URL, it always goes through the CDN. How do I bypass the CDN?
Hi @shrutid56001193 ,
For testing purposes, try changing your rewrite rule to allow query parameters:
RewriteCond %{REQUEST_URI} ^/robots\.txt
RewriteRule .* /content/dam/mycompany/config/robots.txt [PT,L]
To bypass the CDN cache, try calling /robots.txt?nocache=1.
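If the CDN keys its cache on the full URL including the query string (a common default, though worth verifying for your configuration), a unique parameter per request forces a cache miss. A sketch (placeholder hostname) that builds and prints such a request:

```shell
HOST="publish-p12345-e67890.adobeaemcloud.com"  # placeholder domain
# A timestamp makes the query string unique on every run.
BUSTER="nocache=$(date +%s)"
URL="https://${HOST}/robots.txt?${BUSTER}"
echo "curl -sI \"${URL}\""
```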
It still doesn't work. Another observation: /robots.txt shows Adobe's default file, but //robots.txt shows the custom one we want.
Hi @shrutid56001193 ,
Try replacing your dispatcher filter with the one below. Note that /path matches the request path without the extension, while the extension is matched separately by /extension:
# Allow robots.txt
/0105 { /type "allow" /method "GET" /path "/content/dam/mycompany/config/robots" /extension "txt" }
Thanks @konstantyn_diachenko, but it didn't work...
@shrutid56001193 Please try @konstantyn_diachenko's solution, and also clear the dispatcher cache after any rule change; you might be serving an already-cached file.
If it still doesn't work, start hitting the URL on the publisher directly and check whether the dispatcher is calling the publisher, and if so, with what path.
It's AEM as a Cloud Service, so it's not possible to hit the publisher directly.