
SOLVED

Find links in AEM author that use dispatcherhostname/content/... instead of the authoring path /content/...


Hi,

Is there a way to identify links in author that were entered as dispatcherhostname/content/abc.html instead of "/content/abc.html"?

 

For example:

 

Author domain :   http://wcm-xyz-author:4502

Live/dispatcher domain :  https://xyz.com

 

Expected:

In author, a page contains a link whose target URL is /content/abc/welcome.html

The dispatcher URL will be: https://xyz.com/content/abc/welcome.html

 

Actual:

 

Business users author the link as the full URL "https://xyz.com/content/abc/welcome.html" instead of the relative path "/content/abc/welcome.html"

 

Is there a way to find such links, where users authored the entire dispatcher URL as the link?

 

We also have Python scripts that run against the dispatcher to find broken links. When we run the same scripts against author, they ask for AEM author credentials. How can we provide those credentials in the script?

 

1 Accepted Solution


Correct answer by
Employee Advisor

This is inherently hard, because in most cases it's up to the developer how links are rendered. My recommendation is always to use relative links instead of absolute links (unless you are forced to switch hostnames, of course). In that case you are very flexible.

But in AEM there is hardly a way to find these links, unless you are very strict about always creating such links with the same approach/component (but in that case you would not be asking this question), or you create a filter that checks the fully rendered output for such links.

Or even simpler: crawl the whole site and then check it on the crawling side (not in AEM).
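
A minimal sketch of that crawl-side check in Python, using only the standard library. It also touches on the credentials part of the question: it assumes the author instance accepts HTTP Basic authentication, which depends on how the instance is set up (an SSO-only configuration, for example, would need something else). The function names are made up for illustration, and the hostname is the one from the example in the question.

```python
import base64
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urlsplit

DISPATCHER_HOST = "xyz.com"  # live hostname from the question's example


class LinkCollector(HTMLParser):
    """Collects the href values of <a> tags from rendered HTML."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_dispatcher_links(html, host=DISPATCHER_HOST):
    """Return hrefs whose hostname is the dispatcher host.

    Relative links such as /content/abc/welcome.html have no hostname
    and are therefore not reported.
    """
    collector = LinkCollector()
    collector.feed(html)
    return [href for href in collector.hrefs if urlsplit(href).hostname == host]


def authenticated_request(url, user, password):
    """Build a request that sends Basic credentials up front.

    Assumption: the author instance accepts HTTP Basic authentication.
    """
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def check_page(url, user, password):
    """Fetch one rendered page from author and list its dispatcher links."""
    with urllib.request.urlopen(authenticated_request(url, user, password)) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    return find_dispatcher_links(html)
```

Calling `check_page("http://wcm-xyz-author:4502/content/abc/welcome.html", user, password)` for each crawled page would return the links authored with the full dispatcher URL. Reading the user name and password from environment variables or a secrets store, rather than hard-coding them in the script, is usually the safer choice.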

 

(Another approach would be to create something similar to the LinkChecker that just looks at the hostnames of links and records any deviation from the approved list of hostnames. But I am not sure that is worth the effort.)
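
The hostname comparison itself is small. The following Python sketch is not the AEM LinkChecker and does not hook into AEM in any way; it only illustrates the idea of recording links whose hostname is not on an approved list. The allow-list entry and the function name are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list of external hostnames that authors may link to.
APPROVED_HOSTS = {"partner.example.org"}


def unapproved_links(links, approved_hosts=APPROVED_HOSTS):
    """Return (hostname, link) pairs for links whose host is not approved.

    Links without a hostname (relative paths, mailto: links) are skipped.
    """
    deviations = []
    for link in links:
        host = urlsplit(link).hostname
        if host is not None and host not in approved_hosts:
            deviations.append((host, link))
    return deviations
```

With the dispatcher hostname left off the approved list, a link authored as `https://xyz.com/content/abc/welcome.html` is recorded as a deviation, while `/content/abc/welcome.html` passes.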


3 Replies


Community Advisor

If you have the Groovy Console installed in AEM, you can easily get the links from RTE and other link components.

You can simply read the properties and match them against a regex.

Or try a full-text search.
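
The property-plus-regex idea can be sketched as follows. This is Python (to match the scripts mentioned in the question) rather than a Groovy Console script, and it assumes the page content is already available as a nested dictionary, for example parsed from a JSON export such as Sling's default JSON rendering (`.infinity.json`), if that is enabled on your instance. The pattern uses the hostname from the question's example, and the function name is made up.

```python
import re

# Matches absolute links to the live host from the question's example.
ABSOLUTE_LINK = re.compile(r"https?://xyz\.com/content/[^\s\"'<>]*")


def find_absolute_links(node, path=""):
    """Walk nested content and return (property path, matched URL) pairs."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            hits.extend(find_absolute_links(value, f"{path}/{key}"))
    elif isinstance(node, list):
        # Multi-value properties: report hits under the property's own path.
        for item in node:
            hits.extend(find_absolute_links(item, path))
    elif isinstance(node, str):
        for match in ABSOLUTE_LINK.findall(node):
            hits.append((path, match))
    return hits
```

Because every string property is scanned, this catches links inside RTE markup as well as plain link properties, at the cost of also matching URLs that appear in ordinary text.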



Arun Patidar
