SOLVED

Crawler in AEM

Hi,

I am building a crawler in AEM and need to fetch data (title, description) from a third-party link. The content behind this link is paginated (spread across multiple pages).

I am trying to use the Jsoup API, but with Jsoup I am only able to fetch the data from one page.

How can I fetch the data from the next page? That is, how can I follow the "next page" link, get the data, and store it in AEM? Please suggest.

 

Regards,

1 Accepted Solution

To do this, you need to build a custom AEM service that uses the Jsoup API and implements the crawling logic in your application code. For example:

https://jsoup.org/cookbook/extracting-data/example-list-links

Using this API inside a custom AEM service does not change the application logic: the same Jsoup code applies.

To learn how to use Jsoup in an AEM service, see:

http://helpx.adobe.com/experience-manager/using/html-parser-service.html
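
On the "click the next page" part: Jsoup parses HTML and does not click anything or run JavaScript, so pagination usually comes down to reading the URL of the next-page link and fetching that URL in a loop. Below is a minimal sketch of such a loop in plain Java. The names `PaginatedCrawler`, `PageData`, and `crawl` are hypothetical, the fetcher is passed in so the loop does not depend on any one HTTP or parsing library, and the `a[rel=next]` selector mentioned in the comment is only an assumption about how the third-party site marks its next link.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical helper: walks a paginated listing by following each page's "next" URL. */
final class PaginatedCrawler {

    /** Data pulled from one page; nextUrl is null or empty on the last page. */
    static final class PageData {
        final String title;
        final String description;
        final String nextUrl;

        PageData(String title, String description, String nextUrl) {
            this.title = title;
            this.description = description;
            this.nextUrl = nextUrl;
        }
    }

    /**
     * Fetches pages starting at startUrl until there is no next link,
     * a URL repeats (cycle guard), or maxPages is reached.
     *
     * In an AEM service the fetcher could wrap Jsoup, for example:
     *   Document doc = Jsoup.connect(url).get();
     *   String title = doc.title();
     *   String description = doc.select("meta[name=description]").attr("content");
     *   Element next = doc.selectFirst("a[rel=next]");   // assumed markup
     *   String nextUrl = (next == null) ? null : next.absUrl("href");
     */
    static List<PageData> crawl(String startUrl,
                                Function<String, PageData> fetcher,
                                int maxPages) {
        List<PageData> pages = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        String url = startUrl;
        while (url != null && !url.isEmpty()
                && pages.size() < maxPages
                && visited.add(url)) {          // add() returns false for a repeated URL
            PageData page = fetcher.apply(url);
            if (page == null) {
                break;                          // fetcher could not load or parse the page
            }
            pages.add(page);
            url = page.nextUrl;
        }
        return pages;
    }
}
```

The returned list can then be handed to whatever code writes the title and description into the repository. The page cap and the visited set are there so that a site whose last page links back to the first does not keep the service looping.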

Hope this helps. 
