Jsoup in aem

AdobeID24
Level 4

03-04-2020

I want to get all the URLs/hrefs used in an AEM page, so I tried to fetch the page with jsoup, but the response is just the login form:

Document document = Jsoup.connect("http://localhost:4502/content/we-retail/en-us.html").get();

So the request is not being authenticated against AEM.

I also tried creating a session object, but that didn't work either.

Please suggest the best approach for this.

I need every URL/href used in an AEM page, whether it is an external link or an internal page link, available in my Java code so that I can build a report.

Accepted Solutions (1)

Nupur_Jain
MVP

16-07-2020

Hi @AdobeID24 

 

You can use java.net.HttpURLConnection with a Basic Authorization header to request the page as an authenticated user and read its InputStream. Find the code snippet below:

 

 

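        // Requires: java.io.IOException, java.io.InputStream, java.net.HttpURLConnection,
        // java.net.URL, java.nio.charset.StandardCharsets, java.util.Base64
        InputStream content = null;
        try {
            URL url = new URL("http://localhost:4502/content/we-retail/en-us.html");
            // Base64-encode "user:password" for the Basic auth header
            String encoding = Base64.getEncoder()
                    .encodeToString("admin:admin".getBytes(StandardCharsets.UTF_8));
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            connection.setRequestMethod("GET");
            connection.setRequestProperty("Authorization", "Basic " + encoding);
            if (connection.getResponseCode() == HttpURLConnection.HTTP_OK) {
                // The caller should close this stream after reading the page
                content = connection.getInputStream();
            }
        } catch (IOException io) {
            LOGGER.error("Failed to fetch page", io);
        }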
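
Once the page comes back as a stream, the hrefs still have to be pulled out of the markup. The following is only a rough sketch using the JDK alone: the class name `LinkExtractor`, its methods, and the regex are my own illustration, not part of AEM or jsoup, and a regex is a weaker stand-in for a real HTML parser. If jsoup is on the classpath, parsing the stream with it and selecting `a[href]` would likely be more robust.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only
class LinkExtractor {

    // Matches href="..." or href='...' in any tag (so <link> tags are included too)
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*([\"'])(.*?)\\1", Pattern.CASE_INSENSITIVE);

    /** Returns every quoted href value found in the markup, in document order. */
    static List<String> extractHrefs(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(2));
        }
        return links;
    }

    /** Convenience overload for a fetched page stream (readAllBytes needs Java 9+). */
    static List<String> extractHrefs(InputStream in) throws IOException {
        return extractHrefs(new String(in.readAllBytes(), StandardCharsets.UTF_8));
    }

    /**
     * Labels a link "external" when it names a host other than pageHost.
     * Links without a host (relative paths, fragments, mailto:) count as "internal".
     */
    static String classify(String href, String pageHost) {
        try {
            String host = URI.create(href).getHost();
            return (host == null || host.equalsIgnoreCase(pageHost)) ? "internal" : "external";
        } catch (IllegalArgumentException e) {
            return "unparsed"; // href is not a syntactically valid URI
        }
    }

    public static void main(String[] args) {
        String html = "<p><a href=\"/content/we-retail/us/en/men.html\">Men</a> "
                + "<a href='https://www.adobe.com/'>Adobe</a></p>";
        for (String href : extractHrefs(html)) {
            System.out.println(classify(href, "localhost") + "\t" + href);
        }
    }
}
```

With the snippet above, `LinkExtractor.extractHrefs(content)` would give the list of links for the report, and `classify` could split them into internal and external ones.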

 

 

Hope it helps!

Thanks,

Nupur

Answers (1)

vanegi
Employee

16-07-2020

You can use the ACS AEM Commons Report Builder, https://adobe-consulting-services.github.io/acs-aem-commons/features/report-builder/configuring.html, to create such reports.

Or

Create a query on link properties such as "linkTo", combined with a full-text search for markup like "<p><a href="/path">test</a></p>", to fetch the pages that contain URLs/links.
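
As a rough idea of what such a query might look like, here is a sketch that builds QueryBuilder-style predicates as a plain map. The class name `LinkQueryPredicates` is hypothetical, "linkTo" is just the example property mentioned above, and the search root is an assumption; whether a full-text term like `href` actually matches depends on how your index analyses rich-text properties, so treat this as a starting point only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only
class LinkQueryPredicates {

    /** Builds predicates that match nodes with a linkTo property OR text mentioning href. */
    static Map<String, String> build(String rootPath) {
        Map<String, String> predicates = new LinkedHashMap<>();
        predicates.put("path", rootPath);                        // restrict the search to one content tree
        predicates.put("group.p.or", "true");                    // either condition in the group may match
        predicates.put("group.1_property", "linkTo");            // assumed link property name
        predicates.put("group.1_property.operation", "exists");
        predicates.put("group.2_fulltext", "href");              // rich text that contains anchor markup
        predicates.put("p.limit", "-1");                         // return all hits instead of one page of results
        return predicates;
    }

    public static void main(String[] args) {
        // Prints key=value lines, the same shape the QueryBuilder debugger accepts
        build("/content/we-retail").forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Inside AEM, a map like this could be handed to `PredicateGroup.create(...)` and then to `QueryBuilder.createQuery(...)`. The hits would be component nodes rather than pages, so the containing page still has to be resolved from each result path.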