Scheduled job to import data | Community
Level 4
May 14, 2020
Solved

Scheduled job to import data

We have a requirement to call an external REST service, fetch data, and create JCR node entries at a certain time each day.

 

1. Can someone advise on the best option to achieve that?

We also use Jenkins in our project, which could probably be used to schedule the job?

2. I can write a Java class to write the JCR data. Can that class be called from a scheduler?

bilal_ahmad
Level 5
May 15, 2020

Hi @mayukh007

I'd suggest you create a Java scheduler and call your service from it. That way the code will be easy to maintain and debug. Now, coming to the implementation:

1. Write a Java scheduler and set its scheduler.expression to the cron expression for when you want it to run (http://www.cronmaker.com/ can help you build one). Inside the run() method, call your service implementation methods.

2. Write a Service interface that declares the methods you need (such as fetchGetResponse(), writeDataToJcrNodes(), etc.).

3. Implement all of these methods in your Service Impl class.

These three Java classes are the minimum you need to write to meet your requirement.
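To make the three-class split concrete, here is a minimal plain-Java sketch. It is not AEM code: the OSGi annotations and the real HTTP and JCR calls are left out, and the names DataImportService, InMemoryDataImportService, and DataImportScheduler are hypothetical. The canned payload and the in-memory list only stand in for the REST response and the repository.

```java
import java.util.ArrayList;
import java.util.List;

// 1) Service interface: the contract the scheduler depends on.
interface DataImportService {
    String fetchGetResponse();
    int writeDataToJcrNodes(String response);
}

// 2) Service implementation. Hypothetical stand-in: real code would make
//    the HTTP call and write JCR nodes; here both are simulated in memory.
class InMemoryDataImportService implements DataImportService {
    final List<String> writtenNodes = new ArrayList<>();

    @Override
    public String fetchGetResponse() {
        return "alice,bob"; // canned payload instead of a REST call
    }

    @Override
    public int writeDataToJcrNodes(String response) {
        writtenNodes.clear(); // fresh import: drop the previous entries
        for (String name : response.split(",")) {
            writtenNodes.add("/content/mySite/test-rest/" + name);
        }
        return writtenNodes.size();
    }
}

// 3) Scheduler. In AEM this would be an OSGi component whose
//    scheduler.expression is a Quartz-style cron string such as
//    "0 0 2 * * ?" (every day at 02:00); the container then calls run().
class DataImportScheduler implements Runnable {
    private final DataImportService service;
    int lastImportCount = -1;

    DataImportScheduler(DataImportService service) {
        this.service = service;
    }

    @Override
    public void run() {
        String response = service.fetchGetResponse();
        lastImportCount = service.writeDataToJcrNodes(response);
    }
}

class SchedulerSketchDemo {
    public static void main(String[] args) {
        DataImportScheduler job = new DataImportScheduler(new InMemoryDataImportService());
        job.run(); // triggered by hand here; the cron expression does this in AEM
        System.out.println("Imported nodes: " + job.lastImportCount);
    }
}
```

Keeping the scheduler this thin means the import logic can also be called from elsewhere (a servlet or a test, for example) without touching the schedule.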

---------------------------

Logic to process the response and convert it to JCR nodes:

public void testWriteToJCR(Session session) {
    String[] pages = { "page=1", "page=2" };
    if (session.isLive()) {
        try {
            // Remove the nodes from the previous import, if any
            if (session.itemExists("/content/mySite/test-rest")) {
                LOG.debug("Removing existing node: {}", "/content/mySite/test-rest");
                session.getNode("/content/mySite/test-rest").remove();
                session.refresh(true);
                session.save();
            }
            Node jobsRootNode = JcrUtil.createPath("/content/mySite/test-rest",
                    JcrResourceConstants.NT_SLING_ORDERED_FOLDER, session);
            for (String page : pages) {
                // Fetch one page of results from the REST API
                HttpClient client = HttpClientBuilder.create().build();
                String apiUrl = "https://reqres.in/api/users?" + page;
                HttpGet get = new HttpGet(apiUrl);
                ResponseHandler<String> responseHandler = new BasicResponseHandler();
                String response = client.execute(get, responseHandler);
                JSONObject jsonResponse = new JSONObject(response);
                // Takes the value at index 3 of the top-level fields,
                // which is expected to be the array of records
                JSONArray jsonArray = jsonResponse.toJSONArray(jsonResponse.names()).optJSONArray(3);
                Node pageNode = jobsRootNode.addNode(page, NodeType.NT_UNSTRUCTURED);
                session.save();
                // One child node per record; each JSON field becomes a string property
                for (int i = 0; i < jsonArray.length(); i++) {
                    JSONObject jobObject = jsonArray.getJSONObject(i);
                    Node jobNode = pageNode.addNode(jobObject.get("id").toString(), NodeType.NT_UNSTRUCTURED);
                    Iterator<String> keys = jobObject.keys();
                    while (keys.hasNext()) {
                        String nextKey = keys.next().toString();
                        jobNode.setProperty(nextKey, jobObject.get(nextKey).toString());
                    }
                }
                session.save();
                if (session.hasPendingChanges()) {
                    session.refresh(true);
                    session.save();
                }
                LOG.debug("REST response converted to JCR nodes successfully");
            }
        } catch (RepositoryException | IOException | JSONException e) {
            LOG.error("Failed to convert REST response to JCR nodes", e);
        }
    }
}

 

Thanks,

Bilal.

Theo_Pendle
Level 8
May 15, 2020

Hi,

On behalf of the community, thank you for providing an in-depth answer with code snippets 🙂

Personally though, I wouldn't call session.save() in the try block until the very end. If an exception is thrown halfway through, you would be left with half the nodes saved and the others not (i.e., a messed-up node structure).
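As a rough illustration of the "save once at the end" idea, here is a minimal plain-Java sketch. FakeSession is a hypothetical in-memory stand-in, not the JCR API: it only mimics the fact that changes stay pending until save() is called. In real JCR code, the equivalent of discard() would be session.refresh(false).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a JCR session: changes are staged in memory
// and only become visible in 'saved' once save() is called.
class FakeSession {
    final List<String> saved = new ArrayList<>();
    private final List<String> pending = new ArrayList<>();

    void addNode(String path) {
        pending.add(path);
    }

    void save() {
        saved.addAll(pending);
        pending.clear();
    }

    // Drops all unsaved changes.
    void discard() {
        pending.clear();
    }
}

class SingleSaveImporter {
    // Stages every record, then saves once. Returns true on success.
    static boolean importAll(FakeSession session, List<String> ids) {
        try {
            for (String id : ids) {
                if (id.isEmpty()) {
                    throw new IllegalArgumentException("record without id");
                }
                session.addNode("/content/mySite/test-rest/" + id);
            }
            session.save(); // the only save: all nodes or none
            return true;
        } catch (IllegalArgumentException e) {
            session.discard(); // nothing from this run is persisted
            return false;
        }
    }
}
```

With a single save at the end, a bad record in the middle of the batch leaves the stored data untouched instead of half-written.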

@mayukh007, as @bilal_ahmad says, you should implement this as an AEM scheduler, NOT as something in Jenkins. If you're looking for documentation, the official Adobe tutorial on schedulers is here: https://helpx.adobe.com/experience-manager/using/aem-first-components1.html#AddJavafilestotheMavenprojectusingEclipsenbspbr

Ankur_Khare
Community Advisor
Accepted solution
May 15, 2020

A few questions before you implement a scheduler:

1. How frequently will you call the REST API?

2. How frequently could the data from the REST API change?

 

Mayukh007 (Author)
Level 4
May 15, 2020

1. It will run once a day to start with (I'm not sure whether there is a manual option to trigger the job if required).

2. The data in the REST API will change daily. Due to the nature of the data, we will delete the existing JCR nodes (imported the previous day) and do a fresh import of all the data. We have an endpoint that fetches all the data in one call.