Hi
We are implementing a robots.txt file that needs to be served from the DAM (for several sites in several languages).
The first issue we are facing is that we can't access the txt file on the publish environment via its direct path, e.g. /content/dam/robots/<sitename>/en_gb/robots.txt.
When we link to it from a button, the result is the same (the href attribute gets removed).
The same approach works fine with pdf files.
Does anyone know if we missed something regarding file types that can't be published or accessed on the publish environment?
We are using AEM as a Cloud Service, version 2022.11.9850.20221116T162329Z.
Thanks
1. You need to write a redirect rule.
2. Allow robots.txt in the dispatcher filter.
Hi Arun
We have enabled the right rules in the dispatcher but we still have the same issue.
Also, if I bypass the dispatcher by accessing the publish URL directly:
https://publish-pXXXXX-eXXXXX.adobeaemcloud.com/content/dam/robots/<sitename>/en_gb/robots.txt
it still returns a 404.
If I try to access a jpg file, it works.
Please check if this helps
I configured everything as described in the ticket.
I have added the following entries in filters.any:
## Allow version endpoint
/0105 { /type "allow" /method "GET" /url "/robots.txt" }
/0107 { /type "allow" /extension '(txt)' /path "/content/*" }
and added this in my rewrite.rules:
# Rewrite to robots.txt
RewriteCond %{REQUEST_URI} ^/robots.txt$
RewriteRule (.*) /content/dam/robots/mysite/en_gb/robots.txt [PT,L]
When I check the dispatcher logs I see the following entries, but I'm not sure why it doesn't work:
"GET /content/dam/robots/mysite/en_gb/robots.txt" - 0ms [publishfarm/-] [actionblocked] publish-pXXXXX-eXXXXX.adobeaemcloud.com
"GET /content/dam/robots/mysite/en_gb/robots.txt" - 0ms [publishfarm/-] [actionblocked] dev.mysite.com
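For anyone scanning a larger dispatcher log for the same symptom, here is a minimal Python sketch that picks out the blocked requests. It assumes log lines shaped like the two quoted above; the regex, the field names, and the `blocked_requests` helper are my own illustration, not part of any dispatcher tooling.

```python
import re

# Matches the log line shape quoted above. The group names are hypothetical
# labels chosen for this sketch, not official dispatcher field names.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+)" (?P<status>\S+) (?P<millis>\d+)ms '
    r'\[(?P<farm>[^\]]+)\] \[(?P<action>[^\]]+)\] (?P<host>\S+)'
)


def blocked_requests(lines):
    """Return (path, host) pairs for lines whose action is 'actionblocked'."""
    hits = []
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("action") == "actionblocked":
            hits.append((match.group("path"), match.group("host")))
    return hits


sample = [
    '"GET /content/dam/robots/mysite/en_gb/robots.txt" - 0ms '
    '[publishfarm/-] [actionblocked] dev.mysite.com',
]
print(blocked_requests(sample))
```

An `actionblocked` entry means the request was rejected by the filter section before it reached the publish instance, so the filter rules are the place to look.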
Hi Arun
Small update on my side.
If I try it on my own local dispatcher instance, it works fine after making the following change.
/0105 { /type "allow" /method "GET" /url "/robots" /extension "txt" }
I also allowed all the .txt files coming from my content path.
/0100 { /type "allow" /extension '(css|eot|gif|ico|jpeg|jpg|js|pdf|png|svg|swf|ttf|woff|woff2|html|txt)' /path "/content/*" }
I still see actionblocked in the logs.
Your filter rules and rewrite rules all look correct. I can't figure out why the request would still be blocked.
However, I have one suggestion: can you try adding the filter rule below?
/0108 { /type "allow" /url "/content/dam/robots/mysite/en_gb/robots.txt" }
I just want to see whether increased specificity produces a different result.
The issue was resolved by using the following filter:
/0107 { /type "allow" /extension '(txt)' /path "/content/*" }
# Rewrite to robots.txt
RewriteCond %{REQUEST_URI} /robots\.txt$
RewriteRule ^/(.*)$ /content/dam/robots/mysite/en_gb/robots.txt [PT,NC]
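To see how the condition in this final rule differs from the one in my first attempt, here is a small Python sketch comparing the two patterns. It assumes Python's `re` behaves like Apache's regex engine for patterns this simple; the variable names and the `matches` helper are only for illustration.

```python
import re

# RewriteCond pattern from the first attempt: unescaped dot, anchored at both ends.
first_attempt = re.compile(r"^/robots.txt$")
# RewriteCond pattern from the working rule: literal dot, anchored at the end only.
final_rule = re.compile(r"/robots\.txt$")


def matches(pattern, uri):
    """True if the pattern matches anywhere in the URI, as an unanchored regex would."""
    return pattern.search(uri) is not None


for uri in [
    "/robots.txt",
    "/robotsXtxt",
    "/content/dam/robots/mysite/en_gb/robots.txt",
]:
    print(uri, matches(first_attempt, uri), matches(final_rule, uri))
```

The unescaped dot in the first pattern accepts any character in that position, while the final pattern only accepts a literal dot but matches any URI that ends in /robots.txt, including the DAM path itself.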