SOLVED

Query Builder - How to query for specific href outbound links (internal page paths) authored in pages

Hi Team,

How can I use Query Builder to find specific href outbound links (internal page paths) authored in pages?

1 Accepted Solution

Correct answer by
Community Advisor

@Kummari_DilipKu: Can you please confirm how you are authoring "specific href outbound links" in your pages?
If it is through a component, and the href link value is authored via a component field, then you can use the component and its property name to find all occurrences of a specific link value.

path=/content/xyz....
1_property=sling:resourceType
1_property.value=<component_path_used_to_author_the_href_link, e.g. /apps/xyz/.../component_name>
2_property=<href_link_property_name>
2_property.value=<specific href_link_property_value>
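As a rough sketch, the same predicates could be assembled in Java as a plain map, which is the shape QueryBuilder's PredicateGroup.create(Map) accepts inside AEM. The class name, the helper name, and every value below (search root, resource type, the linkURL property, the link value) are hypothetical placeholders, not taken from the question. Note that sling:resourceType is often stored relative to /apps, so it is worth checking the stored value in CRXDE before running the query.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LinkPredicates {

    // Builds the two-property query from the answer above.
    // All argument values are supplied by the caller; nothing here talks to AEM.
    static Map<String, String> forLink(String searchRoot, String resourceType,
                                       String linkProperty, String linkValue) {
        Map<String, String> predicates = new LinkedHashMap<>();
        predicates.put("path", searchRoot);
        predicates.put("1_property", "sling:resourceType");
        predicates.put("1_property.value", resourceType);
        predicates.put("2_property", linkProperty);
        predicates.put("2_property.value", linkValue);
        return predicates;
    }

    public static void main(String[] args) {
        // Hypothetical values, for illustration only
        Map<String, String> predicates = forLink(
                "/content/xyz", "xyz/components/link", "linkURL", "/content/xyz/en/about");
        // Prints one predicate per line, in the same form the query debugger expects
        predicates.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Inside an AEM bundle, a map like this would typically be handed to PredicateGroup.create(...) and then to QueryBuilder.createQuery(...) together with a JCR session.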

Thanks.

4 Replies

Community Advisor

An AEM query can certainly help. If you know which components hold the href value, and it is stored in a specific node property, a normal property query will do. If you are not sure whether the link is authored in one component or several, or it is spread across many node properties, try a full-text query instead, with the search term starting with /content/sitename, since you mentioned you only want internal links used in href.
A simple full-text query looks like this:

path=/content/sitename
fulltext="/content/sitename/path"
orderby=path
p.limit=-1

Hope that helps. You can adjust the values as per your requirements.
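If you want to run a query like this outside the query debugger, one option is the QueryBuilder JSON servlet at /bin/querybuilder.json, which takes the same predicates as URL parameters. The sketch below only builds the URL-encoded request path; the class and method names are made up for illustration, and the host, authentication, and the actual HTTP call are left out.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class QueryBuilderUrl {

    // Turns predicate key/value pairs into a request path for the JSON servlet.
    static String build(Map<String, String> predicates) {
        StringJoiner query = new StringJoiner("&", "/bin/querybuilder.json?", "");
        for (Map.Entry<String, String> entry : predicates.entrySet()) {
            query.add(encode(entry.getKey()) + "=" + encode(entry.getValue()));
        }
        return query.toString();
    }

    // Percent-encodes slashes, quotes and other reserved characters.
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> predicates = new LinkedHashMap<>();
        predicates.put("path", "/content/sitename");
        // Quotes ask for a phrase match on the whole path
        predicates.put("fulltext", "\"/content/sitename/path\"");
        predicates.put("orderby", "path");
        predicates.put("p.limit", "-1");
        System.out.println(build(predicates));
    }
}
```

Keep in mind that p.limit=-1 returns every hit, which can be slow on a large repository, so a smaller limit may be safer for a first try.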

Community Advisor

Hi @Kummari_DilipKu,

There wouldn't be a direct query to retrieve these links. I would advise fetching all the pages of your website and then scanning each page's HTML content for your hrefs.

Maybe something like this:

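package com.yourpackage.servlets;

import com.day.cq.contentsync.handler.util.RequestResponseFactory;
import com.day.cq.search.PredicateGroup;
import com.day.cq.search.QueryBuilder;
import com.day.cq.wcm.api.WCMMode;
import org.apache.sling.api.SlingHttpServletRequest;
import org.apache.sling.api.SlingHttpServletResponse;
import org.apache.sling.api.resource.Resource;
import org.apache.sling.api.resource.ResourceResolver;
import org.apache.sling.api.servlets.SlingSafeMethodsServlet;
import org.apache.sling.engine.SlingRequestProcessor;
import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.Reference;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import javax.jcr.Session;
import javax.servlet.Servlet;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

@Component(service = Servlet.class, property = {
        "sling.servlet.paths=/bin/getauthoredhrefs",
        "sling.servlet.methods=GET"
})
public class AuthoredHrefsServlet extends SlingSafeMethodsServlet {

    private static final Logger LOGGER = LoggerFactory.getLogger(AuthoredHrefsServlet.class);

    @Reference
    private QueryBuilder queryBuilder;

    @Reference
    private RequestResponseFactory requestResponseFactory;

    @Reference
    private SlingRequestProcessor requestProcessor;

    @Override
    protected void doGet(SlingHttpServletRequest request, SlingHttpServletResponse response)
            throws ServletException, IOException {
        response.setContentType("text/plain");
        PrintWriter writer = response.getWriter();
        ResourceResolver resolver = request.getResourceResolver();

        String path = "/content/my-site";
        String templatePath = "/apps/your-app/templates/your-template";

        // Set up the query: every page under the site root that uses the template
        Map<String, String> queryMap = new HashMap<>();
        queryMap.put("path", path);
        queryMap.put("type", "cq:Page");
        queryMap.put("property", "jcr:content/cq:template");
        queryMap.put("property.value", templatePath);
        // QueryBuilder returns only 10 hits by default; -1 asks for all of them
        queryMap.put("p.limit", "-1");

        // Execute the query (QueryBuilder needs a JCR session, not a resolver)
        Iterator<Resource> result = queryBuilder
                .createQuery(PredicateGroup.create(queryMap), resolver.adaptTo(Session.class))
                .getResult().getResources();

        // Process the result
        while (result.hasNext()) {
            String pagePath = result.next().getPath();
            try {
                // Parse the rendered HTML to extract hrefs
                extractHrefs(renderPage(pagePath, resolver), writer);
            } catch (ServletException | IOException e) {
                LOGGER.error("Error rendering page {}", pagePath, e);
                writer.println("Error retrieving hrefs for " + pagePath);
            }
        }
    }

    // A page node stores component properties, not HTML, so the page is rendered
    // through an internal Sling request. This is one possible approach, not the only one.
    private String renderPage(String pagePath, ResourceResolver resolver)
            throws ServletException, IOException {
        HttpServletRequest req = requestResponseFactory.createRequest("GET", pagePath + ".html");
        WCMMode.DISABLED.toRequest(req);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        HttpServletResponse resp = requestResponseFactory.createResponse(out);
        requestProcessor.processRequest(req, resp, resolver);
        return out.toString("UTF-8");
    }

    private void extractHrefs(String htmlContent, PrintWriter writer) {
        // Use an HTML parser or a regex to extract hrefs from the HTML content
        // For simplicity, let's use a regex for demonstration purposes
        String regex = "href=[\"']([^\"']+)[\"']";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(htmlContent);

        while (matcher.find()) {
            String href = matcher.group(1);
            writer.println(href);
        }
    }
}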

Please adjust the regex accordingly, since you are looking for internal paths.
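One way to narrow the regex to internal paths is sketched below. It assumes internal links are authored as repository paths under a known site root such as /content/my-site; the class name, method name, and sample HTML are made up for illustration. The site root must be followed by "/" or "." (or end right there), so a sibling site like /content/my-site-other is not matched.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class InternalHrefExtractor {

    // Returns only the href values that point below the given site root.
    static List<String> internalHrefs(String html, String siteRoot) {
        // Pattern.quote keeps any regex metacharacters in the root literal
        Pattern pattern = Pattern.compile(
                "href=[\"'](" + Pattern.quote(siteRoot) + "(?:[/.][^\"']*)?)[\"']");
        Matcher matcher = pattern.matcher(html);
        List<String> hrefs = new ArrayList<>();
        while (matcher.find()) {
            hrefs.add(matcher.group(1));
        }
        return hrefs;
    }

    public static void main(String[] args) {
        String html = "<a href=\"/content/my-site/en/about.html\">About</a>"
                + "<a href='https://example.com/'>External</a>"
                + "<a href=\"/content/my-site-other/x.html\">Other site</a>"
                + "<a href='/content/my-site/en.html#top'>Top</a>";
        // External and sibling-site links are skipped
        internalHrefs(html, "/content/my-site").forEach(System.out::println);
    }
}
```

If your pages are served with shortened or externalized URLs (for example after Dispatcher rewrites or resource resolver mappings), the rendered hrefs may no longer start with /content, and the site root passed in would need to match the rendered form instead.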
