SOLVED

moved content and broken internal links in AEM/CQ - quick search & replace?

Level 1

Recently we re-organized a bunch of content on our website. I wasn't involved with the content re-org, but an author noticed that hundreds of pages on our website show broken links in author mode, and those links are no longer rendered as links on the published site.

When I looked at a page with the problem, I saw the broken link annotation around the link in question.

When I look at the page source, I notice the link is:  

<a href="https://forums.adobe.com/content/xyz/patient-care/public-health/immunizations1/schedules.html"> 
but it should be: 
<a href="https://forums.adobe.com/content/xyz/patient-care/public-health/immunizations/schedules.html">

It looks like an immunizations1 folder/page was created at some point in time and all of the links got updated to this, which was subsequently renamed or deleted. Is there a way to search and replace in author or in CQDE (AEM Developer Environment)? 

1 Accepted Solution

Correct answer by
Administrator

Hi 

In that case, you can do the following:

Use the Groovy console to crawl over /content/your_site, looking for strings that start with /content.

Then use resourceResolver to check whether each path found actually exists. A sample script implementing this approach is linked and reproduced below.

Link:- https://github.com/Citytechinc/cq-groovy-console [Groovy Tool]

Link:- https://gist.github.com/trekawek/72b3515a6641ca5f4b29 [Groovy Script]

// BrokenLinks

import javax.jcr.*
import org.apache.sling.api.resource.*

// Root of the content tree to scan; change this to your own site.
def ROOT_PATH = '/content/geometrixx'

// Collect every /content/... path found in a property (single- or multi-valued).
def extractPaths(p) {
    if (p instanceof Property && p.multiple) {
        p.values.collect { extractPaths(it) }.flatten()
    } else {
        p.string.findAll(/\/content\/[^"]+/)
    }
}

// Walk the tree and report string properties that reference non-existent resources.
getNode(ROOT_PATH).recurse { node ->
    node.properties.findAll { it.type == PropertyType.STRING }.each {
        def paths = extractPaths(it).findAll { resourceResolver.resolve(it) instanceof NonExistingResource }
        if (!paths.empty) {
            println "Path: ${node.path}"
            println "Broken: ${paths}\n"
        }
    }
}
true

Another option is to write your own service to do this.

 

I hope this would be helpful to you.

Thanks and Regards

Kautuk Sahni


6 Replies

Level 9

Hi,

I'm not sure, but here's a thought.

It probably would not be possible to search for pages with internal links pointing to "immunizations1" unless that value appears in a JCR property.

Maybe you have a rough idea of the time frame in which the modifications happened, and could check all the pages modified during that time frame. This might help identify the list of pages.
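
If the stale path is stored in a string property (rich text components typically keep their HTML in one), the first idea above could be tried with a JCR-SQL2 query. The sketch below only builds the query string. The helper name likeQuery and the property name text are assumptions for illustration, not something confirmed in this thread, and the commented-out lines assume the Groovy console exposes a session binding.

```groovy
// Hypothetical helper: build a JCR-SQL2 query that finds nodes under `root`
// whose `property` contains `needle` as a substring.
String likeQuery(String root, String property, String needle) {
    // Escape the LIKE escape character and wildcards, then double single
    // quotes so the value is safe inside the SQL string literal.
    def escaped = needle.replace('\\', '\\\\')
                        .replace('%', '\\%')
                        .replace('_', '\\_')
                        .replace("'", "''")
    "SELECT * FROM [nt:base] AS s WHERE ISDESCENDANTNODE(s, '${root}') AND s.[${property}] LIKE '%${escaped}%'"
}

def sql = likeQuery('/content/xyz', 'text', '/immunizations1/')
println sql

// In the AEM Groovy console the query could then be run roughly like this
// (not executed here; `session` is assumed to be provided by the console):
// def result = session.workspace.queryManager.createQuery(sql, 'JCR-SQL2').execute()
// result.nodes.each { println it.path }
```

A query like this only covers the one property name you pass in, so the crawl script in the accepted solution remains the more thorough check.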

Administrator

Hi 

Please have a look at "The External Link Checker" tool:

Link:- https://docs.adobe.com/docs/en/aem/6-1/administer/operations/external-link-checker.html [AEM 6.1]

HOW TO VALIDATE EXTERNAL LINKS

To use the external link checker:

  • Open the Tools console.

  • Double-click on External Link Checker (either the right or left pane). A list of all external links is generated.

  • Validate a specific link by selecting it in the list, then clicking Check. Information such as the following is displayed:

    • status of the link
    • URL
    • time since the link was last validated
    • time since the link was last available
    • time since the link was last accessed

  • On the individual content pages, invalid links will be shown as broken.

Another Reference Article :- http://aemexperience.blogspot.in/2015/07/aem-link-checker-fixing-broken-links.html

I hope this would be helpful.

Thanks and Regards

Kautuk Sahni

Level 9

Hi Kautuk,

Thanks for the reply. Did not know that such a feature existed.

Level 1

This doesn't solve the problem. Please re-read the question. 

The list generated by the External Link Checker is fairly short and doesn't include links to any of the web pages with bad links in the text, nor does it include the broken links that I mentioned, which are internal links in the web page text.

Example page on our published site, where the link is gone (it is the last Immunization Schedules link at the very end of the page): http://bit.ly/1T70Fdt

The broken link in AEM author is at the bottom of the page. Screen shot below.

This link is broken across hundreds of pages as noted in my original post. I wanted to quickly find and replace all instances of /content/xyz/patient-care/public-health/immunizations1/schedules.html with /content/xyz/patient-care/public-health/immunizations/schedules.html, which would solve this, without requiring the author to go back and edit each page by hand. 

I'll review the other response to the question to see if it helps. 

Level 7

If Kautuk's suggestion is not working for you, write a service that queries all the pages and returns the pages/nodes with the broken links. Then update the links on those pages/nodes through code. This should solve your problem.
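
As a rough illustration of the update step, here is a minimal Groovy sketch of the string rewrite itself, using the two paths from the question. The helper name rewriteLinks and the dryRun flag are hypothetical. The commented-out loop assumes the Groovy console's getNode(), recurse and session bindings, and it should be tried on a test instance with a backup first.

```groovy
import java.util.regex.Matcher
import java.util.regex.Pattern

// Hypothetical helper: point links under oldDir at newDir instead.
// The lookahead keeps "immunizations1" from also matching "immunizations10".
String rewriteLinks(String text, String oldDir, String newDir) {
    def pattern = Pattern.compile(Pattern.quote(oldDir) + '(?=[/."#?]|$)')
    pattern.matcher(text).replaceAll(Matcher.quoteReplacement(newDir))
}

def OLD_DIR = '/content/xyz/patient-care/public-health/immunizations1'
def NEW_DIR = '/content/xyz/patient-care/public-health/immunizations'

def before = '<a href="/content/xyz/patient-care/public-health/immunizations1/schedules.html">Immunization Schedules</a>'
def after = rewriteLinks(before, OLD_DIR, NEW_DIR)
println after

// Applying it in the AEM Groovy console might look roughly like this
// (not executed here; single-valued string properties only):
// def dryRun = true
// getNode('/content/xyz').recurse { node ->
//     node.properties.findAll { it.type == javax.jcr.PropertyType.STRING && !it.multiple }.each { p ->
//         def updated = rewriteLinks(p.string, OLD_DIR, NEW_DIR)
//         if (updated != p.string) {
//             println "Would update: ${p.path}"
//             if (!dryRun) p.setValue(updated)
//         }
//     }
// }
// if (!dryRun) session.save()
```

Running with dryRun set to true first prints the affected properties without changing anything, which gives the author a list to review before any content is modified.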

Thanks

Tuhin
