SOLVED

I want to remove all the invalid external and internal links referenced in an article. How do I do that?

Level 1

Hi Community, I am a newbie here, looking for help.

I want to find all the invalid (i.e. broken) external and internal links referenced in an article. How do I do that?

I see Adobe articles talking about the link checker. When I open /etc/linkchecker.html, it shows only the external links. When a particular link is shown as invalid, I want to know which article/page references it, so that I can go there and update the link or remove it if it is not required, but the /etc/linkchecker.html page does not show the referencing page.

Also, how do I validate internal links? Say page 1 refers to page 2 and I delete page 2, but page 1 still holds the reference to page 2. Is there any utility in AEM to list the internal page references and highlight deleted pages that are still referenced from other pages?

Many Thanks & Kind Regards,

Kranthi

1 Accepted Solution

Correct answer by
Level 4

Hello @kranthivulpe ,

Indeed, the OOTB link checker cannot report broken links together with the pages that reference them across an entire site.
You can implement a custom script using Groovy Console or AEM Fiddle that walks through /content, extracts links from the properties, and then either checks whether the resource exists via resourceResolver (for internal links) or validates an external link by requesting it from the Groovy/Fiddle script (e.g. in Groovy: println 'http://www.google.com'.toURL().text).

Similar script implementation can be found here
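
To make the idea above concrete, here is a minimal Java sketch of the checking logic only. It is not the linked script and not an AEM API: the class BrokenLinkReport and all its method names are hypothetical, the page content is passed in as a plain map instead of being read from the repository, and the two predicates stand in for resourceResolver.getResource(path) != null (internal links) and an HTTP request (external links). It only looks at href attributes and treats paths under /content/ as internal, which is an assumption you may need to adjust.

```java
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not an AEM API): maps each broken link to the pages that reference it.
final class BrokenLinkReport {

    // Matches href="..." or href='...' inside rich-text property values.
    private static final Pattern HREF = Pattern.compile("href\\s*=\\s*[\"']([^\"']+)[\"']");

    /** Extracts every href target from one property value. */
    static List<String> extractLinks(String propertyValue) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(propertyValue);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    /**
     * @param pageContent    page path -> text property values found under that page
     * @param internalExists stand-in for resourceResolver.getResource(path) != null
     * @param externalOk     stand-in for an HTTP check such as isReachable below
     * @return broken link -> paths of the pages that reference it
     */
    static Map<String, List<String>> findBrokenLinks(Map<String, List<String>> pageContent,
            Predicate<String> internalExists, Predicate<String> externalOk) {
        Map<String, List<String>> broken = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> page : pageContent.entrySet()) {
            for (String value : page.getValue()) {
                for (String link : extractLinks(value)) {
                    boolean ok;
                    if (link.startsWith("http://") || link.startsWith("https://")) {
                        ok = externalOk.test(link);
                    } else if (link.startsWith("/content/")) {
                        ok = internalExists.test(stripExtension(link));
                    } else {
                        continue; // anchors, mailto: and similar are ignored in this sketch
                    }
                    if (!ok) {
                        broken.computeIfAbsent(link, k -> new ArrayList<>()).add(page.getKey());
                    }
                }
            }
        }
        return broken;
    }

    /** Drops a trailing ".html" so "/content/a/b.html" maps to the resource path "/content/a/b". */
    static String stripExtension(String link) {
        return link.endsWith(".html") ? link.substring(0, link.length() - 5) : link;
    }

    /** One possible external check: a HEAD request that treats any status below 400 as valid. */
    static boolean isReachable(String url) {
        try {
            HttpURLConnection con = (HttpURLConnection) URI.create(url).toURL().openConnection();
            con.setRequestMethod("HEAD");
            con.setConnectTimeout(5000);
            con.setReadTimeout(5000);
            return con.getResponseCode() < 400;
        } catch (Exception e) {
            return false; // malformed URL, unreachable host, or timeout
        }
    }
}
```

In a real script you would fill pageContent by traversing /content and pass BrokenLinkReport::isReachable as the external check. Some sites reject HEAD requests, so a GET fallback may be needed.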

Regards

3 Replies

Employee

Hi @kranthivulpe,

There is an OOTB servlet that will return the list of pages that refer to a particular page or asset.

To check if a page or asset is referenced, use

 

https://localhost:4502/bin/wcm/references?
_charset_=utf-8
&path=<path of the page>
&predicate=wcmcontent
&exact=false

 

The output is a JSON response containing an array named 'pages' that lists the references. If the page is not referenced, the array is empty.

This servlet uses the ReferenceSearch API [0]. If you need this data as JSON outside of AEM, you can use the OOTB servlet directly without having to write your own.

[0]: https://helpx.adobe.com/experience-manager/6-4/sites/developing/using/reference-materials/javadoc/co...
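
If you call the servlet from your own code, the page path should be URL-encoded. A small Java sketch of building the request URL follows. The class name ReferencesUrl and the base URL of a local author instance are assumptions for illustration, and the request itself still needs to be authenticated against your instance.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds the request URL for the OOTB references servlet.
final class ReferencesUrl {

    /**
     * @param baseUrl  base URL of the AEM instance (a local author instance is assumed below)
     * @param pagePath repository path of the page or asset to look up
     */
    static String build(String baseUrl, String pagePath) {
        // Encode the path so that characters such as '/' or spaces survive as a query value
        String encodedPath = URLEncoder.encode(pagePath, StandardCharsets.UTF_8);
        return baseUrl + "/bin/wcm/references"
                + "?_charset_=utf-8"
                + "&path=" + encodedPath
                + "&predicate=wcmcontent"
                + "&exact=false";
    }

    public static void main(String[] args) {
        System.out.println(build("http://localhost:4502", "/content/we-retail/language-masters/en/men"));
    }
}
```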


OR

 

For your requirement, like you mentioned, you can leverage the AssetReferenceSearch API [1] which can give the details of Assets used in a page (node of type cq:Page).

[1]: https://helpx.adobe.com/experience-manager/6-4/sites/developing/using/reference-materials/javadoc/co...

 


Here is sample code to accomplish this:

package org.redquark.aem.assets.core;

import java.util.LinkedList;
import java.util.List;
import java.util.Map;

import javax.jcr.Node;
import javax.servlet.Servlet;

import org.apache.sling.api.SlingHttpServletRequest;
import org.apache.sling.api.SlingHttpServletResponse;
import org.apache.sling.api.servlets.SlingSafeMethodsServlet;
import org.osgi.service.component.annotations.Component;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.day.cq.dam.api.Asset;
import com.day.cq.dam.commons.util.AssetReferenceSearch;
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;

@Component(
    service = Servlet.class,
    property = {
        "sling.servlet.methods=GET",
        "sling.servlet.resourceTypes=cq/Page",
        "sling.servlet.selectors=assetreferences",
        "sling.servlet.extensions=json",
        "service.ranking=1000"
    }
)
public class FindReferencedAssetsServlet extends SlingSafeMethodsServlet {

    // Generated serial version UID
    private static final long serialVersionUID = 8446564170082865006L;

    private final Logger log = LoggerFactory.getLogger(this.getClass());

    private static final String DAM_ROOT = "/content/dam";

    @Override
    protected void doGet(SlingHttpServletRequest request, SlingHttpServletResponse response) {

        response.setContentType("application/json");
        response.setCharacterEncoding("UTF-8");

        Gson gson = new GsonBuilder().setPrettyPrinting().create();

        try {
            // Get the current node reference from the resource object
            Node currentNode = request.getResource().adaptTo(Node.class);

            if (currentNode == null) {
                // adaptTo() can return null (unlikely here), so answer with an empty JSON array
                log.error("Cannot adapt resource {} to a node", request.getResource().getPath());
                response.getWriter().print("[]");
                return;
            }

            // Using AssetReferenceSearch which will do all the work for us
            AssetReferenceSearch assetReferenceSearch = new AssetReferenceSearch(currentNode, DAM_ROOT,
                    request.getResourceResolver());

            Map<String, Asset> result = assetReferenceSearch.search();

            List<AssetDetails> assetList = new LinkedList<>();

            for (Asset asset : result.values()) {
                assetList.add(new AssetDetails(asset.getName(), asset.getPath(), asset.getMimeType()));
            }

            // The writer uses the UTF-8 encoding set above, so non-ASCII asset names are safe
            response.getWriter().println(gson.toJson(assetList));
        } catch (Exception e) {
            log.error(e.getMessage(), e);
            // Signal the failure to the caller instead of returning an empty 200 response
            response.setStatus(SlingHttpServletResponse.SC_INTERNAL_SERVER_ERROR);
        }
    }
}

The corresponding AssetDetails model class is as follows:

package org.redquark.aem.assets.core;

public class AssetDetails {

    private String name;
    private String path;
    private String mimeType;

    /**
     * @param name
     * @param path
     * @param mimeType
     */
    public AssetDetails(String name, String path, String mimeType) {
        this.name = name;
        this.path = path;
        this.mimeType = mimeType;
    }

    /**
     * @return the name
     */
    public String getName() {
        return name;
    }

    /**
     * @param name the name to set
     */
    public void setName(String name) {
        this.name = name;
    }

    /**
     * @return the path
     */
    public String getPath() {
        return path;
    }

    /**
     * @param path the path to set
     */
    public void setPath(String path) {
        this.path = path;
    }

    /**
     * @return the mimeType
     */
    public String getMimeType() {
        return mimeType;
    }

    /**
     * @param mimeType the mimeType to set
     */
    public void setMimeType(String mimeType) {
        this.mimeType = mimeType;
    }
}

Now you can invoke this servlet with the following request:

http://localhost:4502/content/we-retail/language-masters/en/men.assetreferences.json

This will give output in the following format

[
  {
    "name": "running-trail-man.jpg",
    "path": "/content/dam/we-retail/en/activities/running/running-trail-man.jpg",
    "mimeType": "image/jpeg"
  },
  {
    "name": "enduro-trail-jump.jpg",
    "path": "/content/dam/we-retail/en/activities/biking/enduro-trail-jump.jpg",
    "mimeType": "image/jpeg"
  },
  {
    "name": "indoor-practicing.jpg",
    "path": "/content/dam/we-retail/en/activities/climbing/indoor-practicing.jpg",
    "mimeType": "image/jpeg"
  }
]

You can edit the AssetDetails class as per your requirement.


Reference:
http://experience-aem.blogspot.com/2015/07/aem-61-get-references-of-page-or-asset.html

 

Thanks!!

Community Advisor

@kranthivulpe 

Try writing some custom Groovy scripts to get the data in one go. You can refer to my article on developing Groovy scripts in AEM.

You can also try the reference report of ACS Commons to get the referenced links.


 


Thanks,

Nikhil