SOLVED

How to export a list of all URLs on the site

Level 1

Hi everyone, I'm wondering: is there a way to do a "URL dump" of all the pages on our site? I've tried the "Create CSV Export" feature, but it's only giving me the page paths -- not something I can share with stakeholders. 

Would love some help if this is something that's actually possible. I've tried Googling and searching here and haven't had any luck.

I don't know whether this is relevant, but I only have authoring/publishing access.

Thanks in advance.

1 Accepted Solution

Correct answer by
Community Advisor

Hi @MsVibey, you can try a Groovy script for this. Sample code:

 

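import com.day.cq.wcm.api.Page;
import com.day.cq.wcm.api.PageManager;

// Assumes a context that provides a resourceResolver binding,
// for example the AEM Groovy Console.
def pageManager = resourceResolver.adaptTo(PageManager);
def siteaPage = pageManager.getPage("/content/sitea");
// Note: listChildren() returns only the direct children of this page.
def pages = siteaPage.listChildren();

for (def page : pages) {
    // Option 1: print the page path with the .html extension
    println page.getPath() + ".html";

    // Option 2: prefix your site's domain name to get the full URL
    // println "domainName" + page.getPath() + ".html";
}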
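
The sample above prints only the direct children of /content/sitea. A minimal sketch of a recursive variant follows, in case every page below the root is wanted. The helper name collectPaths is hypothetical, and the script assumes it runs somewhere a resourceResolver binding exists (such as the AEM Groovy Console) and that /content/sitea is a valid page; outside such a context the final block simply does nothing.

```groovy
// Collect the path of a page and, recursively, of every page beneath it.
// Works with any object exposing getPath() and listChildren(), such as
// com.day.cq.wcm.api.Page. (collectPaths is a hypothetical helper name.)
def collectPaths(page, List<String> paths = []) {
    paths << page.getPath()
    for (child in page.listChildren()) {
        collectPaths(child, paths)
    }
    return paths
}

// Only runs where a resourceResolver binding is available.
if (binding.hasVariable('resourceResolver')) {
    def root = resourceResolver
            .adaptTo(com.day.cq.wcm.api.PageManager)
            .getPage('/content/sitea')
    collectPaths(root).each { println it + '.html' }
}
```

The root page itself is included in the output, followed by its descendants in depth-first order.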

 

7 Replies

Community Advisor

Hi @MsVibey 

 

You can use the out-of-the-box (OOB) Create CSV Export feature available in Sites to generate the list of pages on your website. Please refer to the link below:

 

https://experienceleague.adobe.com/docs/experience-manager-65/authoring/authoring/csv-export.html?la... 

 

Hope this helps.

Level 1

Thanks so much, @Avinash_Gupta_ I have tried that, but just get the path rather than the full URL. Not sure how to get the URL unless I find a formula for Excel that will allow me to change the path in the CSV file to a URL. Which might be an option - I'll check it out.

 

 

Community Advisor

From Author, that is the only way: you need to prepend the domain name to each path to construct the full URL.

 

Himanshu Jain

Level 1

Yes, this is what I do. I pull the CSV report, and then use Find and Replace (Command-F) in Excel to add the domain name and get full public-facing URLs. It's a fairly quick way to get a current URL list.
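
For anyone who would rather script that step than do it in Excel, here is a rough sketch of the same idea: take each path from the CSV export and put the domain in front of it. The function name toUrl, the domain https://www.example.com, and the sample paths are placeholders, not values from this thread. It also assumes the public URL is simply the domain plus the content path plus .html, which may not hold if your site shortens or rewrites /content paths.

```groovy
// Build a full URL from a domain and an AEM content path.
// (toUrl is a hypothetical helper; the domain below is a placeholder.)
String toUrl(String domain, String path) {
    // Drop any trailing slashes on the domain to avoid "//" after the host.
    def base = domain.replaceAll('/+$', '')
    // Leave the extension alone if the path already ends in .html.
    def suffix = path.endsWith('.html') ? '' : '.html'
    return base + path + suffix
}

// Sample paths standing in for the path column of the CSV export.
def paths = ['/content/sitea/en', '/content/sitea/en/about-us']
paths.each { println toUrl('https://www.example.com', it) }
```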

Community Advisor

@MsVibey You could probably use Query Builder queries to get the information you want and export it, if you wish to customize the output to your needs. You could also try Groovy scripting.

Level 1

Thank you both so much. I've saved off this script for someone in the digi team who might be able to do it, but it's beyond me, I'm afraid. I'm a content manager with author/publisher access only, so it's several bridges too far.