SOLVED

Exclude Content nodes from Search engine indexing

Adobe Champion

Is it possible to exclude certain directories in the DAM from Google Search engine indexing?

Is it done at dispatcher level?

Can someone give details on how we can make this possible?

1 Accepted Solution

Correct answer by
Community Advisor

You can place this file anywhere in AEM, but it must be served from the domain root. This can be achieved using Apache/dispatcher redirect rules, e.g. https://github.com/arunpatidar02/aemaacs-aemlab/blob/c4e2ab400f561acc3127dba662a2f1f6c397d340/dispat...

e.g. https://www.mydomain.com/robots.txt



Arun Patidar

7 Replies

Community Advisor

You can try adding those paths to robots.txt.



Arun Patidar

Adobe Champion

@arunpatidar does that file reside on the publisher or the dispatcher? Could you please give me some more details?

 

Community Advisor

@P_V_Nair robots.txt would live under the path /content/dam/[sitename].

You can add the paths you want to exclude under its `Disallow` section.
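
As a rough illustration of the `Disallow` approach, the sketch below embeds a small robots.txt and checks it with Python's standard `urllib.robotparser`. The folder `/content/dam/mysite/private/`, the domain, and the helper name `is_crawlable` are hypothetical placeholders, not values from this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the DAM folder name is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /content/dam/mysite/private/
"""


def is_crawlable(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the sample rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    base = "https://www.mydomain.com"
    # Asset inside the disallowed folder
    print(is_crawlable(base + "/content/dam/mysite/private/report.pdf"))
    # Asset outside the disallowed folder
    print(is_crawlable(base + "/content/dam/mysite/public/logo.png"))
```

Note that a `Disallow` rule only asks well-behaved crawlers not to fetch those paths; it does not restrict access to the assets themselves.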

 

https://experienceleaguecommunities.adobe.com/t5/adobe-experience-manager/robots-txt-file-in-aem-web...

 

Hope this helps!

Community Advisor

As Arun mentioned, we can add the paths that should not be crawled by search engines to robots.txt.

 

The robots.txt file should be available at the root level (e.g. https://www.example.com/robots.txt).
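
Crawlers request robots.txt only from the root of the host, so a copy that is reachable only at a deeper path such as /content/dam/[sitename]/robots.txt is not consulted. A minimal sketch of that lookup, using a hypothetical helper `robots_txt_url` and a made-up asset path:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(resource_url: str) -> str:
    """Return the root-level robots.txt URL that governs resource_url."""
    parts = urlsplit(resource_url)
    # Keep scheme and host (with port); replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # Hypothetical DAM asset URL, used only for illustration.
    print(robots_txt_url("https://www.example.com/content/dam/mysite/private/report.pdf"))
```

Whatever the asset URL, the result points at the host root, which is why the file has to be exposed there.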

 

You can keep the file on the publisher and add a redirect in the dispatcher configuration so that it loads at the root level, as in the URL above.

 

More details on robots.txt can be found at:

https://www.semrush.com/blog/beginners-guide-robots-txt/?kw=&cmp=US_SRCH_DSA_Blog_EN&label=dsa_pagef...

Level 7

Yes, it is possible to keep certain DAM directories out of Google Search in AEM. You can list those directories under `Disallow` in robots.txt so that search engine bots do not crawl them. For HTML pages you can also add a robots meta tag with `noindex`, which tells search engines not to index the page. DAM assets such as PDFs and images have no HTML to carry a meta tag, so the equivalent for them is the `X-Robots-Tag: noindex` response header.

This can also be handled at the dispatcher level. The Apache web server that hosts the dispatcher can add that header for the relevant paths. Filter rules in dispatcher.any can block requests to a path, but they do not by themselves control indexing.