Query Builder - How to query to get specific href outbound links (internal page path) authored in pages
Level 2
January 26, 2024
Solved


Hi Team,

How can I use Query Builder to find specific href outbound links (internal page paths) authored in pages?

 

 


3 replies

DPrakashRaj
Community Advisor
January 26, 2024

An AEM query will certainly help. If you know which components hold the href value, and it is stored in one specific node property, you can use a normal property query. If you are not sure whether it comes from one component or several, or the value is spread across many node properties, try a full-text query with the path restricted to /content/sitename, since you mentioned you only want internal links used in href.
A simple full-text query looks like this:

path=/content/sitename
fulltext="/content/sitename/path"
orderby=path
p.limit=-1
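
For trying the full-text query above outside the Query Builder debugger, here is a minimal sketch that assembles the same predicates into a query string for the Query Builder JSON servlet (/bin/querybuilder.json on a standard AEM instance). The class and method names are made up for illustration, and the site root and link path are just the placeholder values from the query above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of any AEM API.
final class FulltextQueryUrl {

    // Same four predicates as the query above; insertion order is preserved.
    static Map<String, String> predicates(String siteRoot, String linkPath) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("path", siteRoot);
        // Quoting the value asks for a phrase match on the whole path.
        p.put("fulltext", "\"" + linkPath + "\"");
        p.put("orderby", "path");
        p.put("p.limit", "-1"); // -1 returns all hits; use with care on large sites
        return p;
    }

    // URL-encodes each key and value and joins them with '&'.
    static String toQueryString(Map<String, String> predicates) {
        return predicates.entrySet().stream()
                .map(e -> enc(e.getKey()) + "=" + enc(e.getValue()))
                .collect(Collectors.joining("&"));
    }

    private static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String qs = toQueryString(predicates("/content/sitename", "/content/sitename/path"));
        System.out.println("/bin/querybuilder.json?" + qs);
    }
}
```

The printed path can be appended to the host of an author instance and opened while logged in. Keep in mind that a full-text match only tells you which nodes mention the path, not which property holds it.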

 

Hope that helps. You can adjust the values as per your requirements.

B_Sravan
Community Advisor
January 28, 2024

Hi @kummari_dilipku ,

There is no direct query to retrieve these. I would advise fetching all the pages of your website and then scanning their content to find your hrefs.

Maybe something like this:

package com.yourpackage.servlets;

import com.day.cq.search.PredicateGroup;
import com.day.cq.search.QueryBuilder;
import org.apache.felix.scr.annotations.*;
import org.apache.sling.api.SlingHttpServletRequest;
import org.apache.sling.api.SlingHttpServletResponse;
import org.apache.sling.api.resource.Resource;
import org.apache.sling.api.servlets.SlingSafeMethodsServlet;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import javax.jcr.Session;
import javax.servlet.Servlet;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

@Component(immediate = true, metatype = true)
@Service(Servlet.class)
@Properties({
    @Property(name = "sling.servlet.paths", value = "/bin/getauthoredhrefs"),
    @Property(name = "sling.servlet.methods", value = "GET")
})
public class AuthoredHrefsServlet extends SlingSafeMethodsServlet {

    private static final Logger LOGGER = LoggerFactory.getLogger(AuthoredHrefsServlet.class);

    // For simplicity this uses a regex; a real HTML parser is more robust.
    private static final Pattern HREF = Pattern.compile("href=[\"']([^\"']+)[\"']");

    @Reference
    private QueryBuilder queryBuilder;

    @Override
    protected void doGet(SlingHttpServletRequest request, SlingHttpServletResponse response)
            throws ServletException, IOException {
        response.setContentType("text/plain");
        PrintWriter writer = response.getWriter();
        String path = "/content/my-site";
        String templatePath = "/apps/your-app/templates/your-template";
        try {
            // Set up the query: all pages under the path that use the template
            Map<String, String> queryMap = new HashMap<>();
            queryMap.put("path", path);
            queryMap.put("type", "cq:Page");
            queryMap.put("property", "jcr:content/cq:template");
            queryMap.put("property.value", templatePath);
            queryMap.put("p.limit", "-1"); // the default limit is 10 hits

            // Execute the query; createQuery expects a JCR Session
            Session session = request.getResourceResolver().adaptTo(Session.class);
            Iterator<Resource> result = queryBuilder
                    .createQuery(PredicateGroup.create(queryMap), session)
                    .getResult().getResources();

            // Process the result
            while (result.hasNext()) {
                Resource content = result.next().getChild("jcr:content");
                if (content != null) {
                    scanResource(content, writer);
                }
            }
        } catch (RuntimeException e) {
            LOGGER.error("Error querying for pages", e);
            response.setStatus(HttpServletResponse.SC_INTERNAL_SERVER_ERROR);
            writer.write("Error retrieving hrefs.");
        }
    }

    // Pages have no jcr:data property holding rendered HTML, so walk the
    // jcr:content subtree and scan every single-valued String property
    // (for example rich text stored by a text component).
    private void scanResource(Resource resource, PrintWriter writer) {
        for (Object value : resource.getValueMap().values()) {
            if (value instanceof String) {
                extractHrefs((String) value, writer);
            }
        }
        for (Resource child : resource.getChildren()) {
            scanResource(child, writer);
        }
    }

    // Writes every href attribute value found in the given markup
    private void extractHrefs(String htmlContent, PrintWriter writer) {
        Matcher matcher = HREF.matcher(htmlContent);
        while (matcher.find()) {
            writer.println(matcher.group(1));
        }
    }
}

Please adjust the regex accordingly, since you are looking for internal paths.
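
As a rough idea of what such an adjusted regex could look like, here is a small standalone sketch that only keeps hrefs pointing at or below a given site root. The class name, method name and the /content/my-site root in the usage are assumptions for illustration; it is plain Java with no AEM dependencies, so it can be tried on its own before moving the pattern into a servlet.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
final class InternalHrefExtractor {

    // Returns href values that equal siteRoot or start with siteRoot + "/".
    static List<String> internalHrefs(String html, String siteRoot) {
        // Pattern.quote keeps characters in the root path from being read as regex syntax.
        Pattern pattern = Pattern.compile(
                "href=[\"'](" + Pattern.quote(siteRoot) + "(?:/[^\"']*)?)[\"']");
        Matcher matcher = pattern.matcher(html);
        List<String> hrefs = new ArrayList<>();
        while (matcher.find()) {
            hrefs.add(matcher.group(1));
        }
        return hrefs;
    }

    public static void main(String[] args) {
        String html = "<a href=\"/content/my-site/en/about.html\">About</a>"
                + "<a href='https://example.com'>External</a>";
        internalHrefs(html, "/content/my-site").forEach(System.out::println);
    }
}
```

The optional "/..." group is there so that a sibling site such as /content/my-site-other is not picked up by accident.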

Kamal_Kishor
Community Advisor
Accepted solution
January 29, 2024

@kummari_dilipku: Can you please confirm how you are authoring "specific href outbound links" in your pages?
If it is through a component, and the href link value is authored via a component field, then you can use the component's resource type and the field's property name to find all occurrences of a specific link value.

path=/content/xyz....
1_property=sling:resourceType
1_property.value=<component_path_used_to_author_the_href_link i.e /apps/xyz/.../component_name>
2_property=<href_link_property_name>
2_property.value=<specific href_link_property_value>
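
If you want to run the same two-property query from code, a minimal sketch of the predicate map could look like the one below. Every concrete value in main (the site root, the my-app/components/link resource type, the linkURL property name and the target path) is a made-up example, not something from this thread, so replace them with what you see in CRXDE. In many projects sling:resourceType is stored relative to /apps (without the /apps/ prefix), so it is worth checking the exact stored value first.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; no AEM classes are used here.
final class ComponentLinkQuery {

    // Builds the numbered property predicates from the query above.
    static Map<String, String> predicates(String root, String resourceType,
                                          String linkProperty, String linkValue) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("path", root);
        p.put("1_property", "sling:resourceType");
        p.put("1_property.value", resourceType);
        p.put("2_property", linkProperty);
        p.put("2_property.value", linkValue);
        p.put("p.limit", "-1"); // assumption: return all hits instead of the default page size
        return p;
    }

    public static void main(String[] args) {
        // All four arguments are placeholder values.
        Map<String, String> p = predicates("/content/my-site",
                "my-app/components/link", "linkURL", "/content/my-site/en/about");
        // Prints the predicates in the line-based form accepted by the Query Builder debugger.
        p.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Inside an OSGi component the same map can be handed to PredicateGroup.create(...) and QueryBuilder.createQuery(...), in the same way as in the servlet posted earlier in this thread.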

 thanks.

Level 2
January 29, 2024

ok thank you!