varuns46785756
Level 5
December 7, 2016
Solved

Crawler in AEM

Hi,

I am building a crawler in AEM and need to fetch data (title, description) from a third-party link. The content behind this link is paginated (it spans multiple pages).

I am trying to use the Jsoup API, but with Jsoup I am able to fetch the data from one page only.

How can I follow the "next page" link, fetch the data from each following page, and store it in AEM? Please suggest.

Regards,

1 reply

smacdonald2008
Accepted solution
Level 10
December 7, 2016

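To perform this task, you need to build a custom AEM service that uses the Jsoup API, with application logic that extracts the data and follows the links between pages. For example, this Jsoup cookbook entry shows how to list the links on a page: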

https://jsoup.org/cookbook/extracting-data/example-list-links

Because you are calling the same API from within a custom AEM service, the application logic is the same as it would be in a standalone Java program.
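
As a rough illustration of that application logic, here is a minimal sketch of following a "next page" link with Jsoup. It is not taken from the linked articles. The class name `PaginatedCrawler`, the `PageLoader` interface, the `a[rel=next]` selector and the `MAX_PAGES` limit are all assumptions for the example, so adjust the selector to whatever markup the third-party site uses for its pagination link.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.io.IOException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PaginatedCrawler {

    /** Holds the two fields asked for in the question, plus the page URL. */
    public static final class PageData {
        public final String url;
        public final String title;
        public final String description;

        PageData(String url, String title, String description) {
            this.url = url;
            this.title = title;
            this.description = description;
        }
    }

    /** Hypothetical abstraction over "load this URL", so the loop can run offline in tests. */
    public interface PageLoader {
        Document load(String url) throws IOException;
    }

    // Assumption: the site marks its pagination link with rel="next".
    private static final String NEXT_LINK_SELECTOR = "a[rel=next]";
    // Assumption: a safety limit so a misbehaving site cannot make the loop run forever.
    private static final int MAX_PAGES = 50;

    public static List<PageData> crawl(String startUrl, PageLoader loader) throws IOException {
        List<PageData> results = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        String url = startUrl;

        // Stop when there is no next link, when a URL repeats, or when the limit is reached.
        while (url != null && !url.isEmpty()
                && results.size() < MAX_PAGES && visited.add(url)) {
            Document doc = loader.load(url);

            // attr() returns an empty string if the page has no description meta tag.
            String description = doc.select("meta[name=description]").attr("content");
            results.add(new PageData(url, doc.title(), description));

            // Resolve the next link against the page's base URI to get an absolute URL.
            Element next = doc.select(NEXT_LINK_SELECTOR).first();
            url = (next == null) ? null : next.absUrl("href");
        }
        return results;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("Usage: PaginatedCrawler <start-url>");
            return;
        }
        // Live fetch over HTTP; Jsoup sets the base URI from the requested URL.
        for (PageData page : crawl(args[0], u -> Jsoup.connect(u).get())) {
            System.out.println(page.url + " | " + page.title + " | " + page.description);
        }
    }
}
```

Inside an AEM service, the `crawl` method could be called from the service implementation and each returned `PageData` written to the repository. That storage step is left out here because it depends on your content structure.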

To learn how to use Jsoup in an AEM service, see:

http://helpx.adobe.com/experience-manager/using/html-parser-service.html

Hope this helps.