I am trying to create a robots.txt file in AEM by following these two links:
[1] http://www.wemblog.com/2013/06/how-to-implement-robotstxt-sitemapxml.html
[2] https://forums.adobe.com/thread/896654
Following [1], the text file does not work at all. We can create a simple nt:file robots.txt file in CRX, but when we request the file at
http://localhost:4502/content/mysite/en/robots.txt, the browser downloads the file instead of displaying its content on the page.
I also enabled the txt rendition (Enable Plain Text) in the Apache Sling GET Servlet, but the result was the same.
Following [2], when you print a page property in Sightly using ${pageProperties.robotsTextContent}, the property is printed on one line in the .html page:
User-agent: * Disallow: / New text: q Newer Text:s
even though it was entered on separate lines in the text area. The line separator is important in a robots.txt file, so we need the property to be output on separate lines, as authored in the page properties. I tried @context='text' and @context='html' in the Sightly file, but it printed '\n' as is on the page, without a line break.
Somehow the txt rendition is not working either. After creating a robotstext.txt.html file and enabling the text rendition in the Apache Sling GET Servlet, requesting the page outputs the following:
http://localhost:4502/content/mysite/en/robots.txt
** Resource dumped by PlainTextRendererServlet**
Resource path:/content/mysite/en/robots
Resource metadata: {sling.modificationTime=-1, sling.characterEncoding=null, sling.parameterMap={}, sling.contentType=null, sling.creationTime=-1, sling.contentLength=-1, sling.resolutionPath=/content/mysite/en/robots, sling.resolutionPathInfo=.txt}
Resource type: cq:Page
Resource super type: -
** Resource properties **
jcr:primaryType: cq:Page
jcr:createdBy: admin
jcr:created: java.util.GregorianCalendar[time=1480233560436,areFieldsSet=true,areAllFieldsSet=true,lenient=false,zone=sun.util.calendar.ZoneInfo[id="GMT+11:00",offset=39600000,dstSavings=0,useDaylight=false,transitions=0,lastRule=null],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2016,MONTH=10,WEEK_OF_YEAR=49,WEEK_OF_MONTH=5,DAY_OF_MONTH=27,DAY_OF_YEAR=332,DAY_OF_WEEK=1,DAY_OF_WEEK_IN_MONTH=4,AM_PM=1,HOUR=6,HOUR_OF_DAY=18,MINUTE=59,SECOND=20,MILLISECOND=436,ZONE_OFFSET=39600000,DST_OFFSET=0]
When the Apache Sling GET Servlet text rendition is disabled, the request outputs an error.
Has anyone else tried putting a robots.txt file in AEM? Why is the OOTB text rendition not working properly in AEM?
Any pointers for either of the two approaches will be highly appreciated.
There should be absolutely nothing wrong with Sling (Sling is A.W.E.S.O.M.E).
Look at what you are asking Sling to do,
and try to divide it into smaller tasks:
1) Print the entire string in a static Sling servlet
2) Read the file from the DAM and print it in a Sling servlet
3) Create a selector and, following that selector, read the file from the DAM
Hope that makes sense.
Regards,
Peter
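Step 1 above can be sketched in plain Java. This is only an illustration under assumptions: the class and method names (RobotsTxtBody, render) are hypothetical, and the Sling servlet wiring is left out. It takes the text as authored (for example, the robotsTextContent page property from the question), normalizes the line endings, and returns the body that a servlet would write after setting the response content type to text/plain. Serving it as plain text rather than HTML is what keeps each directive on its own line.

```java
// Hypothetical helper, not part of AEM or Sling.
final class RobotsTxtBody {

    private RobotsTxtBody() {
    }

    // Returns the authored text with Unix line endings and a trailing newline.
    // Blank lines are kept because robots.txt can use them to separate groups.
    static String render(String authored) {
        if (authored == null || authored.isEmpty()) {
            return "";
        }
        // Convert Windows (\r\n) and lone \r line endings to \n.
        String normalized = authored.replace("\r\n", "\n").replace('\r', '\n');
        return normalized.endsWith("\n") ? normalized : normalized + "\n";
    }

    public static void main(String[] args) {
        // In a servlet, this string would go to the response writer instead.
        System.out.print(render("User-agent: *\r\nDisallow: /"));
    }
}
```

In an actual servlet, the same string would be written to the response writer, so the line breaks from the text area reach the crawler unchanged.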
I'm not a developer, but I can tell you what we did. We used a template that we had created for raw content. We created a page under our en_us locale and named it robots. In the page properties, we put robots.txt in the vanity URL. Since this was a custom template, we also had a tab for raw content, where we just pasted our User-Agent: * and Disallow: / paths. Our SEO partners approved this method as well. Not sure if this will help you or not and I could be missing a piece since I'm not a dev but thought I'd throw it out there...good luck!
We place the robots.txt in a folder in the DAM and then have the Dispatcher redirect any requests for it to the appropriate site's robots.txt file. Simple, and it works.
Hi,
here is another solution, the way we did it:
We simply use the power of the Apache Sling framework (https://sling.apache.org) and its resource resolution mechanism (Sling's URL decomposition).
Follow these steps:
Of course, you need to check your web server and/or Dispatcher configuration to make sure that these requests reach the AEM instance.
This solution has the advantage that it does not require any special OSGi configuration, that it can easily be integrated into a "Continuous Delivery/Deployment" workflow, and that it can easily be extended if the file content should be customizable by the user.
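For readers unfamiliar with Sling's URL decomposition, here is a rough sketch of the idea in plain Java. It is a simplified illustration, not Sling's actual implementation and not necessarily the exact setup described in this reply: the class name UrlDecomposition is hypothetical, suffixes are ignored, and the real resolver also checks which resource exists in the repository. It splits a request path into a resource path, selectors, and an extension, which is how /content/mysite/en/robots.txt ends up addressing the resource /content/mysite/en/robots with the extension txt (matching the resolutionPath and resolutionPathInfo in the dump from the question).

```java
// Simplified illustration only; Sling's real resolver also checks which
// resource actually exists in the repository and supports suffixes.
final class UrlDecomposition {
    final String resourcePath;
    final String selectors; // empty string when there are none
    final String extension; // null when the last segment has no dot

    private UrlDecomposition(String resourcePath, String selectors, String extension) {
        this.resourcePath = resourcePath;
        this.selectors = selectors;
        this.extension = extension;
    }

    static UrlDecomposition of(String requestPath) {
        int lastSlash = requestPath.lastIndexOf('/');
        // The first dot in the last path segment ends the resource path.
        int firstDot = requestPath.indexOf('.', lastSlash + 1);
        if (firstDot < 0) {
            return new UrlDecomposition(requestPath, "", null);
        }
        String rest = requestPath.substring(firstDot + 1);
        int lastDot = rest.lastIndexOf('.');
        // Everything before the last dot is selectors, the remainder is the extension.
        String selectors = lastDot < 0 ? "" : rest.substring(0, lastDot);
        String extension = lastDot < 0 ? rest : rest.substring(lastDot + 1);
        return new UrlDecomposition(requestPath.substring(0, firstDot), selectors, extension);
    }

    public static void main(String[] args) {
        UrlDecomposition d = of("/content/mysite/en/robots.txt");
        System.out.println(d.resourcePath + " | " + d.selectors + " | " + d.extension);
    }
}
```

Sling uses the resource's type together with the selectors and extension to choose the script or servlet that renders the response, so a txt request can be handled by a dedicated text renderer.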
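I tried the following in Apache and it worked:
# Match any request whose URI ends in robots.txt (applies to the RewriteRule below)
RewriteCond %{REQUEST_URI} robots\.txt$
# Remove the header that makes the browser download the file instead of displaying it
Header unset Content-Disposition
RewriteRule ^(.*?)$ /content/dam/path$1 [NC,PT]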