questions on OOTB sitemap generator | Community
jayv25585659
Level 8
July 31, 2024
Solved

question 1:

Do I need to customize the resourceTypes property to include my custom resource types? Or does the default value already cover all pages whose component is a descendant of "wcm/foundation/components/basicpage/v1/basicpage"?

 

Example: my editable template has a resourceSuperType of "core/wcm/components/page/v1/page", which in turn has a resourceSuperType of "wcm/foundation/components/basicpage/v1/basicpage".

question 2:

Since the URL to access the sitemap seems to be non-standard (example: http://localhost:4502/content/mysite/en.sitemap.xml), do I need to write Apache rewrite rules so that it is accessible at https://www.mysite.com/sitemap.xml? (Or is it easier to just use robots.txt to tell crawlers my sitemap URL?)

Thanks a lot.

2 replies

arunpatidar
Community Advisor
Accepted solution
July 31, 2024

Hi @jayv25585659 
Please check:

Answer 1: You need to provide the Sling resource type of your en page; only then will en.sitemap.xml resolve for that page.

 

Answer 2: Yes and no. It depends on how the crawler will access the sitemap for crawling. I suggest removing internal AEM content paths from the sitemap so that both content and sitemap use the same content mapping.

Example URLs:
- https://site.ch/de-ch.sitemap-index.xml
- https://site.ch/de-ch.sitemap.xml
- https://site.ch/de-ch.sitemap.products-sitemap.xml
- https://site.ch/fr-ch.sitemap-index.xml
- https://site.ch/fr-ch.sitemap.xml

Arun Patidar
narendiran_ravi
Level 6
August 1, 2024

Answer to Question 1 - All page components that inherit from the core page component are included, so there is no need to update this OOTB property. As long as your root page's component is a descendant of the core page component, it is fine.

 

Answer to Question 2 - Yes, you may need to rewrite the URL, depending on your SEO needs. If a URL containing the sitemap selector is acceptable for SEO, you can simply reference it in robots.txt. Otherwise, a rewrite rule is recommended:

RewriteCond %{REQUEST_URI} ^/en/sitemap\.xml/?$
RewriteRule ^(.*)$ /content/${CONTENT_FOLDER_NAME}/en.sitemap.xml [PT,L]
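
For the exact URL asked about in question 2 (https://www.mysite.com/sitemap.xml), a variant of the rule above might look like the following. This is only a sketch: it assumes the rule lives in the site's virtual host, that /content/mysite/en is the page carrying the sitemap (taken from the question's example URL, not a confirmed path), and that the request is then passed through to the dispatcher.

```apache
# Sketch only: /content/mysite/en is the example path from the question,
# not a confirmed site root. Adjust it to your own content structure.
RewriteEngine On

# Serve the en page's generated sitemap at the conventional /sitemap.xml URL.
# PT passes the rewritten path on to later handlers; L stops further rewriting.
RewriteRule ^/sitemap\.xml$ /content/mysite/en.sitemap.xml [PT,L]
```

If the selector URL is acceptable instead, the robots.txt route needs no rewrite at all: a single `Sitemap:` line with the absolute URL of the sitemap is enough for crawlers that honor that directive.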