SOLVED

AEMaaCS robots.txt

Level 5

I have to create a robots.txt file for my project.

 

We are using AEMaaCS. My questions are:

1. Where should I put the file? Should it be under /content/dam/project_name or somewhere else?

2. Should it be generated manually or dynamically using a servlet?

 

Are there any good, up-to-date resources I can refer to? Currently, I am referring to the ones below, but they are a bit unclear and fairly old.

http://www.wemblog.com/2013/06/how-to-implement-robotstxt-sitemapxml.html

https://adobe-consulting-services.github.io/acs-aem-commons/features/robots-txt/index.html

 

Can someone please guide me in the right direction?

 

 

1 Accepted Solution

Correct answer by
Community Advisor

In AEM, the robots.txt file should be accessible at the root URL of your website, typically https://www.example.com/robots.txt. However, in AEM as a Cloud Service (AEMaaCS), a robots.txt file placed under /content/dam/project_name may not be directly accessible at the root URL, because its published path reflects the content structure.

To make the file accessible at the root URL, you may need to configure CDN rules that map the root URL to the file's location. In practice, this means adding a rule to your CDN configuration so that requests for https://www.example.com/robots.txt are directed to the actual location of the robots.txt file in AEM.

4 Replies

Community Advisor

Hi @goyalkritika ,

 

Refer to the blog post and video below for detailed guidance on where to place the robots.txt file:

https://www.aemtutorial.info/2020/07/robotstxt-file-in-aem-websites.html

https://www.youtube.com/watch?v=3xy-z41Isws

 

Manual vs. dynamic generation:

  • The choice between manually creating the robots.txt file or generating it dynamically using a servlet depends on your project requirements.
  • If your robots.txt file has static rules that don't change frequently, you can create it manually. Simply create a new file named robots.txt and add the desired rules and directives.
  • If your robots.txt file needs to be generated dynamically, for example with rules based on configurable properties or content, you can create a servlet that generates it. The servlet fetches the necessary data and constructs the robots.txt content on the fly.
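
To make the dynamic option more concrete, here is a minimal sketch of the content-building part only, in plain Java. It is not a Sling servlet and not the ACS AEM Commons implementation; the class name RobotsTxtBuilder, the build method, and the example paths are hypothetical. A real servlet would read the rules from its own configuration and write the returned string to the response as text/plain.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper: turns configured rules into robots.txt text.
final class RobotsTxtBuilder {

    private RobotsTxtBuilder() {
    }

    /**
     * Builds robots.txt content.
     *
     * @param disallowByAgent user-agent name mapped to the paths it must not crawl
     * @param sitemapUrl      absolute sitemap URL, or null/empty to omit the line
     */
    static String build(Map<String, List<String>> disallowByAgent, String sitemapUrl) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, List<String>> entry : disallowByAgent.entrySet()) {
            out.append("User-agent: ").append(entry.getKey()).append('\n');
            List<String> paths = entry.getValue();
            if (paths.isEmpty()) {
                // An empty Disallow value means nothing is blocked for this agent.
                out.append("Disallow:\n");
            }
            for (String path : paths) {
                out.append("Disallow: ").append(path).append('\n');
            }
            // A blank line separates one user-agent group from the next.
            out.append('\n');
        }
        if (sitemapUrl != null && !sitemapUrl.isEmpty()) {
            out.append("Sitemap: ").append(sitemapUrl).append('\n');
        }
        return out.toString();
    }
}
```

For example, calling RobotsTxtBuilder.build with a map containing "*" mapped to a list holding "/private/" (an assumed path) and the sitemap URL https://www.example.com/sitemap.xml produces a "User-agent: *" group with one Disallow line, a blank line, and a Sitemap line. Pass a LinkedHashMap if the order of the user-agent groups matters.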

Level 5

@MayurSatav based on my project requirements, I'll go with manual creation.

 

So, I created the file under /content/dam/project_name with the required permissions and published it.

 

Shouldn't robots.txt be accessible at domain_name/robots.txt?

In our case, it will not be accessible at that URL, because we are not doing any URL masking.

Any insights?
