SOLVED

Find and Fix Broken Links in AEM

Community Advisor

Hello All,

I'm probably bringing back a previously asked question about finding and fixing broken links in AEM. Earlier discussions mention solutions such as a Groovy script (which I can't use currently) and the link checker (https://experienceleaguecommunities.adobe.com/t5/adobe-experience-manager/broken-link-scan/qaq-p/220...), along with suggestions to disable the link checker because of possible performance issues.

I tried the link checker tool in AEM 6.5. It showed a couple of external links that I had authored incorrectly, but not the internal link that I had authored incorrectly. Even for the incorrect external links, we can't immediately find where they are used, though I suppose we could run a query (involving devs).

https://www.tadigital.com/insights/perspectives/identify-and-fix-broken-links-aem-link-checker and https://www.tadigital.com/exchange/link-checker/ look very promising, but I am not sure how the tool is implemented.

Please share your thoughts on the following, or any other ideas.

- Periodically, on author, go through all nodes (and their properties) and check the authored links and hrefs. Collect the paths with broken links, or a map of broken link --> affected paths, and save it. Perhaps add a component where we can input a path or path tree, the broken link, and the updated link.

- Keep the link checker on in author and use the etc/linkchecker tool to get the broken links, then query for where those links are used. For broken internal links, think of something else.
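The first idea (a map of broken link --> affected paths) could look roughly like the sketch below. It is only an illustration in plain Java: the class name `BrokenLinkIndex`, the input map of content paths to property values, and the `isBroken` check are all hypothetical. In AEM the property values would come from walking the repository, and the check would resolve internal paths or request external URLs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: builds a "broken link -> affected paths" map from authored property values. */
final class BrokenLinkIndex {

    // Matches href="..." inside rich-text values (illustrative only, not a full HTML parser).
    private static final Pattern HREF = Pattern.compile("href\\s*=\\s*\"([^\"]+)\"");

    /** Returns the hrefs found in a value, or the value itself if it looks like a plain link property. */
    static List<String> extractLinks(String value) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(value);
        while (m.find()) {
            links.add(m.group(1));
        }
        // Assumption: plain link properties start with /content/ or http.
        if (links.isEmpty() && (value.startsWith("/content/") || value.startsWith("http"))) {
            links.add(value);
        }
        return links;
    }

    /**
     * @param propertiesByPath content path -> property values authored under that path
     * @param isBroken         caller-supplied check that decides whether a link is broken
     * @return broken link -> paths where it is used
     */
    static Map<String, List<String>> build(Map<String, List<String>> propertiesByPath,
                                           Predicate<String> isBroken) {
        Map<String, List<String>> index = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : propertiesByPath.entrySet()) {
            for (String value : entry.getValue()) {
                for (String link : extractLinks(value)) {
                    if (isBroken.test(link)) {
                        List<String> paths = index.computeIfAbsent(link, k -> new ArrayList<>());
                        if (!paths.contains(entry.getKey())) {
                            paths.add(entry.getKey());
                        }
                    }
                }
            }
        }
        return index;
    }
}
```

The saved map could then back the proposed component: given a broken link and its replacement, it already lists the paths that need updating.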

 

Thanks

1 Accepted Solution

Correct answer by
Community Advisor

@Shubham_borole,

This is another idea.

If you have Node.js installed on your machine, there's a tool that I personally use to crawl an entire website and check for broken links. If there are any, the report highlights the page under test and which of its links is broken.

https://www.npmjs.com/package/broken-link-checker

The tool is great: you provide it with the root URL of the website, and it keeps following links down through the site and outputs the results.
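To illustrate the general idea of traversing from a root and reporting broken links per page, here is a rough breadth-first sketch in plain Java. It is not how broken-link-checker is implemented; the class name `SiteCrawlSketch` and the `fetchLinks` function (which returns a page's outgoing links, or an empty `Optional` when the URL cannot be loaded) are assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical sketch of a recursive link check starting from a root URL. */
final class SiteCrawlSketch {

    /**
     * @param root       URL to start from; only links under this prefix are crawled further
     * @param fetchLinks returns the links found on a URL, or Optional.empty() if the URL is broken
     * @return page URL -> broken links found on that page
     */
    static Map<String, List<String>> findBrokenLinks(
            String root, Function<String, Optional<List<String>>> fetchLinks) {
        Map<String, List<String>> brokenByPage = new LinkedHashMap<>();
        Map<String, Optional<List<String>>> cache = new HashMap<>(); // fetch each URL once
        Deque<String> queue = new ArrayDeque<>();
        Set<String> queued = new HashSet<>();
        queue.add(root);
        queued.add(root);

        while (!queue.isEmpty()) {
            String page = queue.remove();
            Optional<List<String>> links = cache.computeIfAbsent(page, fetchLinks);
            if (!links.isPresent()) {
                continue; // the page itself could not be loaded
            }
            for (String link : links.get()) {
                Optional<List<String>> target = cache.computeIfAbsent(link, fetchLinks);
                if (!target.isPresent()) {
                    // Record the broken link against the page that contains it.
                    brokenByPage.computeIfAbsent(page, k -> new ArrayList<>()).add(link);
                } else if (link.startsWith(root) && queued.add(link)) {
                    queue.add(link); // internal page not seen before: crawl it too
                }
            }
        }
        return brokenByPage;
    }
}
```

External links are checked but not followed, which keeps the crawl inside the site being tested.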

 

The example below is a report generated from https://photoshop.com/en:

BrianKasingli_0-1607686453703.png

Once you install the npm package, you can simply run something like:

blc https://photoshop.com/en -ro

 


5 Replies

Community Advisor
@Shubham_borole, if you make sure a link to sitemap.html exists somewhere on the page, this helps ensure that all pages are scraped.

Employee Advisor

@Shubham_borole The link checker is mostly used to check external links, and it does have a performance impact. Moreover, it only checks the links; fixing them is manual. Since you need both to find broken links and to update them, I would suggest using the Sling rewriter pipeline.

Sling can rewrite the generated markup of a page through its rewriter pipeline, which is active in AEM by default (the built-in link checker and link rewriting features use it as well). You would create a new transformer type containing the logic needed to fix the broken links and add it to the rewriter configuration.

Code Examples:

https://helpx.adobe.com/experience-manager/using/aem63_link_rewriter.html#AddJavafilestotheMavenproj...

https://www.flexibledesigns.rs/creating-a-link-rewriter/

http://www.wemblog.com/2011/08/how-to-remove-html-extension-from-url.html
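As a rough sketch of the transformer logic described above, the handler below rewrites `href` attributes on `a` elements using a lookup of known broken links. It uses only the JDK's SAX classes so it can run outside AEM. In a real project the same `startElement` logic would sit in a class implementing `org.apache.sling.rewriter.Transformer`, created by a `TransformerFactory`; that wiring is omitted here. The class name `LinkFixingHandler` and the `replacements` map are hypothetical.

```java
import java.util.Map;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical sketch: swaps known broken hrefs for their replacements and passes events on. */
final class LinkFixingHandler extends XMLFilterImpl {

    // Assumption: broken link -> corrected link, maintained by the authoring team.
    private final Map<String, String> replacements;

    LinkFixingHandler(Map<String, String> replacements, ContentHandler next) {
        this.replacements = replacements;
        setContentHandler(next); // next step in the pipeline
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        Attributes out = atts;
        boolean isAnchor = "a".equalsIgnoreCase(localName) || "a".equalsIgnoreCase(qName);
        int hrefIndex = atts.getIndex("href");
        if (isAnchor && hrefIndex >= 0) {
            String fixed = replacements.get(atts.getValue(hrefIndex));
            if (fixed != null) {
                // Copy the attributes so the incoming object is left untouched.
                AttributesImpl copy = new AttributesImpl(atts);
                copy.setValue(hrefIndex, fixed);
                out = copy;
            }
        }
        super.startElement(uri, localName, qName, out);
    }
}
```

Because the rewrite happens at render time, the authored content stays unchanged; the replacements map has to be extended whenever a new broken link is found.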

 

 

Community Advisor
Thanks, I should have mentioned this. I thought this would have to be done on the publishers, and that we would have to keep adding new broken links and their replacements as we find them. I will think through this approach. Thanks.

Level 4

Where can this link checker tool from TA Digital be downloaded?