To use the out-of-the-box (OOTB) sitemap generation provided by the Apache Sling Sitemap module, do I need to create any Oak indexes?
Right now only the on-demand option works for me; the only sign of scheduled generation is this warning:
14.04.2022 16:20:00.120 *WARN* [sling-default-4-we-retail en-US Sitemaps] org.apache.jackrabbit.oak.query.QueryImpl Traversal query (query without index): select [jcr:path], [jcr:score], * from [nt:base] as a where [sling:sitemapRoot] = true and isdescendantnode(a, '/content/we-retail/global/en') option(index tag [slingSitemaps]) /* xpath: /jcr:root/content/we-retail/global/en//*[@sling:sitemapRoot=true] option(index tag slingSitemaps) */; consider creating an index
Is there a suggested index that I should create to make it work?
I added this one, with the name as suggested:
{
  "jcr:primaryType": "oak:QueryIndexDefinition",
  "compatVersion": 2,
  "includedPaths": [
    "/content/we-retail"
  ],
  "seed": -8084877133496368591,
  "type": "lucene",
  "async": [
    "async"
  ],
  "evaluatePathRestrictions": true,
  "reindex": false,
  "reindexCount": 3,
  "indexRules": {
    "jcr:primaryType": "nt:unstructured",
    "nt:base": {
      "jcr:primaryType": "nt:unstructured",
      "properties": {
        "jcr:primaryType": "nt:unstructured",
        "sitemapRoot": {
          "jcr:primaryType": "nt:unstructured",
          "propertyIndex": true,
          "name": "sling:sitemapRoot"
        }
      }
    }
  }
}
but nothing changed.
After seeing your post, I tried it myself and ran into a similar issue.
I referred to the following article: Apache Sling Sitemap for AEM 6.5.11 and AEMaaCs – AEM Queries & Solutions (wordpress.com), then created a scheduler configuration at /apps/weretail/config.publish/org.apache.sling.sitemap.impl.SitemapScheduler~weretail.cfg.json and published it.
Now I no longer get the above warning in the error.log file, and the sitemap.xml file is generated.
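For anyone following along, here is a minimal sketch of what such a scheduler configuration file might contain. The property names are the ones I understand the Apache Sling Sitemap scheduler to expose; the scheduler name, the cron expression and the search path are assumed example values, not taken from the article or from my actual setup:

```json
{
  "scheduler.name": "we-retail Sitemaps",
  "scheduler.expression": "0 0 2 * * ?",
  "searchPath": "/content/we-retail"
}
```

With values like these the scheduler would look for sitemap roots below /content/we-retail once a day at 02:00. Adjust the path and schedule to your own site.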
Hope this will help. Please review.
The thing is that in your sample the sitemap is generated on demand only. When you disable this option in "Apache Sling Sitemap - Sitemap Generator Manager", you will stop seeing your sitemap.
Sitemaps generated by the scheduler should be visible in /var/sitemaps.
I used the Sitemap Scheduler here, and on-demand generation in Apache Sling Sitemap - Sitemap Generator Manager is disabled.
The generated sitemap is available at /var/sitemaps/content/we-retail/us/sitemap.xml on the publish instance.
Thanks, it is working on publish. Should it work the same way on author?
Is there any suggested Oak index we should apply to avoid those traversal query warnings?
And on the dispatcher, should we allow access to paths under the /var/sitemaps folder, such as /var/sitemaps/content/we-retail/us/es/sitemap.xml?
Thanks
When I did this exercise, I didn't see a sitemap on author.
If we access the sitemap index at localhost:4503/content/we-retail/us.sitemap-index.xml, it returns a sitemap location such as <loc>http://localhost:4503/content/we-retail/us.sitemap.xml</loc>.
To allow HTTP requests for the sitemap index and sitemap files, add the following configuration to the dispatcher/src/conf.dispatcher.d/filters/filters.any file:
...
# Allow AEM WCM Core Components sitemaps
/0200 { /type "allow" /path "/content/*" /selectors '(sitemap-index|sitemap)' /extension "xml" }
As a next step, put rewrite rules in place to ensure that .xml sitemap HTTP requests are routed to the correct underlying AEM page. If URL shortening is not used, or if Sling Mappings are used to achieve URL shortening, this configuration is not needed.
Rewrite rule in dispatcher/src/conf.d/rewrites/rewrite.rules:
...
RewriteCond %{REQUEST_URI} (.html|.jpe?g|.png|.svg|.xml)$
RewriteRule ^/(.*)$ /content/${CONTENT_FOLDER_NAME}/$1 [PT,L]
Apache Sling Sitemap for AEM 6.5.11 and AEMaaCs – AEM Queries & Solutions (wordpress.com) has all the details.
Hi @broman__pl and @DEBAL_DAS ,
I'm also facing the same issue. Can you please share details on how these traversal queries got fixed in your case?
*WARN* [sling-default-1-My Scheduler] org.apache.jackrabbit.oak.plugins.index.Cursors$TraversingCursor Traversed 10000 nodes with filter Filter(query=select [jcr:path], [jcr:score], * from [nt:base] as a where [sling:sitemapRoot] = true and isdescendantnode(a, '/content/we-retail') option(index tag [slingSitemaps]) /* xpath: /jcr:root/content/we-retail//*[@sling:sitemapRoot=true] option(index tag slingSitemaps) */, path=/content/we-retail//*, property=[:indexTag=[slingSitemaps], sling:sitemapRoot=[true]]); consider creating an index or changing the query
I've created a scheduler configuration at /apps/weretail/config.publish/org.apache.sling.sitemap.impl.SitemapScheduler~weretail.cfg.json on my publish instance, but I can still see these warnings in error.log.
Also, I see that sitemaps are created under the /var/sitemaps folder, but http://localhost:4503/content/we-retail/us.sitemap.xml is still not accessible. Are there any other configurations we have to apply to make it work?
Regards,
Radha
Please use an index if the number of traversed nodes is large. An index is now available with the service pack (SP) setup, and it is mentioned in the release notes as well.
Is there a link for this?
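One thing worth checking, offered as an assumption rather than a confirmed fix: the query in the warning carries option(index tag slingSitemaps), and for such queries Oak only considers indexes whose tags property contains that tag. The index definition posted earlier in this thread has no tags property. A sketch of the same definition with the tag added could look like this (the "tags" line and "reindex": true are the additions; includedPaths is specific to we-retail):

```json
{
  "jcr:primaryType": "oak:QueryIndexDefinition",
  "type": "lucene",
  "compatVersion": 2,
  "async": ["async"],
  "evaluatePathRestrictions": true,
  "tags": ["slingSitemaps"],
  "includedPaths": ["/content/we-retail"],
  "reindex": true,
  "indexRules": {
    "jcr:primaryType": "nt:unstructured",
    "nt:base": {
      "jcr:primaryType": "nt:unstructured",
      "properties": {
        "jcr:primaryType": "nt:unstructured",
        "sitemapRoot": {
          "jcr:primaryType": "nt:unstructured",
          "name": "sling:sitemapRoot",
          "propertyIndex": true
        }
      }
    }
  }
}
```

Setting reindex to true asks Oak to rebuild the index; the flag switches back to false once reindexing has finished. Whether this removes the warning in your setup is something you would need to verify in error.log.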