Hi everyone, I'm wondering: is there a way to do a "URL dump" of all the pages on our site? I've tried the "Create CSV Export" feature, but it's only giving me the page paths -- not something I can share with stakeholders.
Would love some help if this is something that's actually possible. I've tried Googling and searching here and haven't had any luck.
I don't know whether this is relevant, but I only have authoring/publishing access.
Thanks in advance.
Hi @MsVibey
You can use the out-of-the-box (OOB) CSV export feature available in Sites to generate the list of pages on your website. Kindly refer to the link below:
Hope this helps.
Thanks so much, @Avinash_Gupta_ I have tried that, but just get the path rather than the full URL. Not sure how to get the URL unless I find a formula for Excel that will allow me to change the path in the CSV file to a URL. Which might be an option - I'll check it out.
From Author, that will be the only way; you need to prepend the domain name to each path to construct the full URL.
Yes, this is what I do. I pull the CSV report, and then use the find and replace function (Command+F) in Excel to add the domain name and get full public-facing URLs. It's a fairly quick way to get a current URL list.
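If the find and replace step becomes tedious, the same conversion could be scripted. The following is only a minimal sketch in Python, not part of AEM: the column name `Path`, the domain `https://www.example.com`, and the `content_root` prefix are all assumptions, so check them against the header row of your own export and against how your public site maps paths.

```python
import csv
import io


def paths_to_urls(csv_text, domain, path_column="Path",
                  content_root="", extension=".html"):
    """Turn the page paths in an exported CSV into full URLs.

    path_column and content_root are assumptions; adjust them to match
    the real export and the real public URL structure.
    """
    base = domain.rstrip("/")
    urls = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        path = (row.get(path_column) or "").strip()
        if not path:
            continue  # skip rows that have no page path
        # Optionally drop an internal prefix that is not part of public URLs.
        if content_root and path.startswith(content_root):
            path = path[len(content_root):]
        urls.append(base + path + extension)
    return urls


# Hypothetical two-row export, for illustration only.
sample = "Title,Path\nHome,/content/sitea/en\nAbout,/content/sitea/en/about\n"
for url in paths_to_urls(sample, "https://www.example.com"):
    print(url)
```

To use it on a real file, read the export into a string first (for example with `open("export.csv", encoding="utf-8").read()`) and pass that in place of `sample`.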
@MsVibey You can probably use Query Builder queries to get the information you want, and export the results too if you wish to customize the output to your needs. You could also try Groovy scripting.
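For whoever picks this up, here is a rough sketch of what the Query Builder route might look like. It assumes the Query Builder JSON servlet is reachable at `/bin/querybuilder.json` on your author instance, that your account is allowed to call it, and that each hit in the response carries a `path` field. The host `http://localhost:4502` and the root `/content/sitea` are placeholders, and both function names are made up for illustration.

```python
from urllib.parse import urlencode


def build_page_query_url(author_host, site_root):
    """Build a Query Builder URL that asks for every cq:Page under site_root."""
    params = {
        "path": site_root,   # where to search
        "type": "cq:Page",   # only return pages
        "p.limit": "-1",     # ask for all results rather than the first page
    }
    return author_host.rstrip("/") + "/bin/querybuilder.json?" + urlencode(params)


def hit_paths(response):
    """Collect the page paths from a parsed Query Builder JSON response."""
    return [hit["path"] for hit in response.get("hits", []) if "path" in hit]


# Placeholder host and site root; replace with your own values.
print(build_page_query_url("http://localhost:4502", "/content/sitea"))
```

Opening the printed URL in a browser while logged in to Author should return JSON. The paths in it would still need the domain added to become public URLs.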
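Hi @MsVibey, you could try a Groovy script for this. Sample code:
import com.day.cq.wcm.api.Page;
import com.day.cq.wcm.api.PageManager;
// Assumes the script runs in the AEM Groovy Console, which provides resourceResolver
def pageManager = resourceResolver.adaptTo(PageManager);
// Replace "/content/sitea" with the root path of your own site
def siteaPage = pageManager.getPage("/content/sitea");
// Note: listChildren() returns only the direct children of the root page
def pages = siteaPage.listChildren();
for(def page : pages) {
// Print the path only
println page.getPath()+".html";
// or, to print the full URL, use this line instead ("domainName" is a placeholder)
// println "domainName" + page.getPath()+".html";
}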
Thank you both so much. I've saved off this script for someone in the digi team who might be able to do it, but it's beyond me, I'm afraid. I'm a content manager with author/publisher access only, so it's several bridges too far.