Regarding Link Checker (AEM 6.1 and newer)

Level 4

I see a lot of discussion about turning it on and off and tweaking it, but little substance on why you’d run it in a production environment.

A. What is the expected behavior of the link checker in a production environment?
    1. Does a broken link prevent the page from being presented?
    2. Does a broken link cause the link to be excluded from the page?
    3. Does the link checker substitute something (an image, etc.) for the broken link?

B. Should the link checker be enabled in a production environment?
    1. As "linked" to the previous question, why enable it if you've tested your site?
    2. If it is enabled, can the link checker be told to notify someone by email or by posting to some other URL (a trouble ticket)?

I ask because the link checker configuration does (a) scans of the content and (b) network calls to the internet. Both take processing and network capacity away from the application. If deployed applications have been tested sufficiently or comprehensively, link checking in a production environment has limited value unless it notifies somebody.
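To make the cost concrete, here is a minimal sketch of the two steps: scan a page for links, then work out how many of them would need an outbound network call. It is not AEM's link checker, only an illustration in Python; the names `LinkCollector` and `external_links`, the sample markup, and the host `www.example.com` are all made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects href values from anchor tags (the content-scan step)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def external_links(html, own_host):
    """Return the distinct absolute http(s) links that point at other hosts.

    Each returned link is one outbound request a checker would have to make.
    """
    collector = LinkCollector()
    collector.feed(html)
    seen = []
    for link in collector.links:
        parsed = urlparse(link)
        is_web_link = parsed.scheme in ("http", "https")
        if is_web_link and parsed.hostname != own_host and link not in seen:
            seen.append(link)
    return seen


# Hypothetical page fragment: one internal path, one repeated external link,
# and one absolute link back to the site's own host.
page = """
<a href="/content/site/en.html">Home</a>
<a href="https://example.org/docs">Docs</a>
<a href="https://example.org/docs">Docs again</a>
<a href="https://www.example.com/about">About</a>
"""

to_check = external_links(page, own_host="www.example.com")
print(len(to_check), "outbound check(s) needed:", to_check)
```

Even in this toy form, every page costs a parse and every distinct external link costs a request, which is the overhead I am asking about.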

Thank you for taking the time to review and answer my questions.

3 Replies

Level 10

Whether you enable it on author or on production depends on your use case. Here is a good reference:

https://sling.apache.org/documentation/bundles/output-rewriting-pipelines-org-apache-sling-rewriter....

As you can see, you can do more than just check links using the Apache Sling Rewriter. If you want to manipulate HTML pages dynamically, you need to deploy it on production. If you just want to check links, you can leave it on author.

Here are more community articles on this subject:

http://aemexperience.blogspot.ca/2015/07/aem-link-checker-fixing-broken-links.html

https://helpx.adobe.com/experience-manager/using/creating-link-rewrite.html (we will update this soon for AEM 6.2) 

Level 4

Okay. Thanks. This is good info and leads me to the almighty "it depends", which is okay.

So when you use such a rewriter approach in your application, is there a dependency you declare in the manifest to be sure the rewriter is on, and if so, what is it? I ask because, say, we elect to turn it off for the instance and then a new app on-boards that requires it. We are then back to configuring the regular expression to exclude other apps, or the app doesn't start because the dependent module (rewriter) is disabled, or worse yet, we have to dig through logs for app issues caused by having no rewriter on the classpath.

One of my coworkers made a good point that log monitoring agents could be used to notify someone if a link check fails. I am still interested in other community responses to the questions I posed, mainly because we have seen new apps deployed into production (I assume you meant the publish instance) that create an entire swath of processor peaks from the content scan and a slam on the network from checking links.
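The log monitoring idea could be sketched roughly as follows, in Python. Everything here is an assumption: the actual link checker log format is not given anywhere in this thread, so the `BROKEN_LINK` pattern and the sample log lines are invented, and `send` stands in for whatever the monitoring agent really does (email, ticket system, and so on).

```python
import re

# Hypothetical pattern: adjust it to match the lines your instance really logs.
BROKEN_LINK = re.compile(
    r"linkchecker.*\binvalid\b.*?(?P<url>https?://\S+)", re.IGNORECASE
)


def find_broken_links(lines, pattern=BROKEN_LINK):
    """Return the URL from every log line that matches the pattern."""
    urls = []
    for line in lines:
        match = pattern.search(line)
        if match:
            urls.append(match.group("url"))
    return urls


def notify(urls, send):
    """Pass one summary message to `send` if any failures were found."""
    if urls:
        send("Link check failures:\n" + "\n".join(urls))
        return True
    return False


# Made-up log lines, only to exercise the functions above.
sample_log = [
    "01.01.2016 10:00:00 *INFO* LinkCheckerTask checked 12 links",
    "01.01.2016 10:00:01 *WARN* LinkCheckerTask link invalid: http://example.org/missing",
    "01.01.2016 10:00:02 *INFO* SomeOtherService started",
]

sent = []  # stand-in for a mail or ticket client
notify(find_broken_links(sample_log), sent.append)
```

Keeping `send` as a plain callable means the same scan could feed an email client or a ticket endpoint without changing the matching logic.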

With the era of metered bandwidth slowly coming upon us (ISPs throttling by network usage and bytes served over a billing cycle), I think it's important that tighter controls are available to limit bandwidth (link checks) and cloud processor usage (content scans during peak site activity).

Level 4

The preface text on this page explains my reasons for digging into this.  Thanks for the responses!

https://helpx.adobe.com/experience-manager/kb/DisableLinkChecker.html