Scheduled job to import data

Mayukh007

14-05-2020

We have a requirement to call an external REST service, fetch data, and create JCR node entries at a certain time each day.

 

1. Can someone advise on the best way to achieve that?

We also use Jenkins in our project, which could probably be used to schedule the job.

 

2. I can write a Java class to write the JCR data. Can that class be invoked by a scheduler?

Accepted Solutions (1)

Ankur_Khare

MVP

14-05-2020

A few questions before you implement a scheduler:

 

1. How frequently will you call the REST API?

2. How frequently can the data returned by the REST API change?

 

Answers (1)

bilala23933647

14-05-2020

Hi @Mayukh007

I'd suggest you create a Java scheduler and call your service from it. That makes the code easier to maintain and debug. Now, coming to the implementation:

1. Write a Java scheduler class and set scheduler.expression to the cron expression for when it should run (http://www.cronmaker.com/ can generate one). Inside the run() method, call your service implementation methods.

2. Write a Service interface that declares the methods you need (such as fetchGetResponse(), writeDataToJcrNodes(), etc.).

3. Implement all of these methods in your Service Impl class.

 

These three Java classes are the minimum you need to write to meet your requirement.
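To make the three-class split concrete, here is a minimal sketch of how the pieces could fit together. It is not code from a real project: the names DataImportService, DataImportScheduler and InMemoryDataImportService are hypothetical, and the OSGi annotations appear only as comments so that the sketch compiles without the AEM SDK. The cron value shown is a Quartz expression for "every day at midnight" and is just an example.

```java
import java.util.ArrayList;
import java.util.List;

// 2. Service interface: declares what the scheduler needs (method names from the list above).
interface DataImportService {
    String fetchGetResponse() throws Exception;
    void writeDataToJcrNodes(String response) throws Exception;
}

// 1. Scheduler: a plain Runnable. In AEM it would typically be registered with
//    Sling's scheduler through OSGi annotations such as:
//      @Component(service = Runnable.class, property = {
//          "scheduler.expression=0 0 0 * * ?",      // Quartz cron: daily at 00:00:00
//          "scheduler.concurrent:Boolean=false" })  // do not overlap runs
//    and the service would be injected with @Reference instead of a constructor.
class DataImportScheduler implements Runnable {
    private final DataImportService service;

    DataImportScheduler(DataImportService service) {
        this.service = service;
    }

    @Override
    public void run() {
        try {
            String response = service.fetchGetResponse();
            service.writeDataToJcrNodes(response);
        } catch (Exception e) {
            // In AEM you would log through SLF4J; System.err keeps the sketch dependency-free.
            System.err.println("Scheduled import failed: " + e);
        }
    }
}

// 3. Service Impl: a stub that records what it was asked to write.
//    A real implementation would call the REST endpoint and create JCR nodes.
class InMemoryDataImportService implements DataImportService {
    final List<String> written = new ArrayList<>();

    @Override
    public String fetchGetResponse() {
        return "{\"data\":[]}";
    }

    @Override
    public void writeDataToJcrNodes(String response) {
        written.add(response);
    }
}
```

Keeping the scheduler this thin means the fetch and write logic can be exercised directly (for example from a unit test) without waiting for the cron trigger.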

---------------------------

Logic to process the response and convert it to JCR nodes:

	public void testWriteToJCR(Session session) {

	 String[] pages = {
	  "page=1",
	  "page=2"
	 };

	 if (session.isLive()) {
	  try {
	   if (session.itemExists("/content/mySite/test-rest")) {
	    LOG.debug("Removing existing node: {}", "/content/mySite/test-rest");
	    session.getNode("/content/mySite/test-rest").remove();
	    session.refresh(true);
	    session.save();
	   }
	   Node jobsRootNode = JcrUtil.createPath("/content/mySite/test-rest",
	    JcrResourceConstants.NT_SLING_ORDERED_FOLDER, session);

	   for (String page: pages) {
	    HttpClient client = HttpClientBuilder.create().build();
	    String apiUrl= "https://reqres.in/api/users?" + page;
	    HttpGet get = new HttpGet(apiUrl);
	    ResponseHandler < String > responseHandler = new BasicResponseHandler();
	    String response = client.execute(get, responseHandler);
	    JSONObject jsonResponse = new JSONObject(response);
	    // reqres.in returns the user records under the "data" key; read it by name rather than by position
	    JSONArray jsonArray = jsonResponse.optJSONArray("data");
	    Node pageNode = jobsRootNode.addNode(page, NodeType.NT_UNSTRUCTURED);
	    session.save();

	    for (int i = 0; i < jsonArray.length(); i++) {
	     JSONObject jobObject = jsonArray.getJSONObject(i);
	     Node jobNode = pageNode.addNode(jobObject.get("id").toString(), NodeType.NT_UNSTRUCTURED);

	     Iterator < String > keys = jobObject.keys();
	     while (keys.hasNext()) {
	      String nextKey = keys.next().toString();
	      jobNode.setProperty(nextKey, jobObject.get(nextKey).toString());
	     }
	    }
	    session.save();

	    if (session.hasPendingChanges()) {
	     session.refresh(true);
	     session.save();
	    }
	    LOG.debug("REST response converted to JCR nodes successfully");
	   }
	  } catch (RepositoryException | IOException | JSONException e) {
	   LOG.error("Failed to convert REST response to JCR nodes", e);
	  }
	 }
	}

 

Thanks,

Bilal.

Hi,

On behalf of the community, thank you for providing an in-depth answer with code snippets 🙂

Personally, though, I wouldn't call session.save() inside the try block until the very end. If an exception is thrown halfway through, you are left with half the nodes saved and the rest missing (in other words, a messed-up node structure).

@Mayukh007 as @bilala23933647 says, you should implement this as an AEM scheduler, NOT as something in Jenkins. If you're looking for documentation, the official Adobe tutorial on schedulers is here: https://helpx.adobe.com/experience-manager/using/aem-first-components1.html#AddJavafilestotheMavenpr...
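A minimal sketch of that "save once at the end" idea, under stated assumptions: PendingChanges and NodeWriter are hypothetical stand-ins (PendingChanges mirrors only save() and refresh(boolean) from javax.jcr.Session) so the example compiles without the JCR API on the classpath. In real code you would call the same two methods on the Session itself.

```java
import java.util.List;

// Hypothetical stand-in for the two javax.jcr.Session methods this pattern relies on.
interface PendingChanges {
    void save() throws Exception;                       // mirrors Session.save()
    void refresh(boolean keepChanges) throws Exception; // mirrors Session.refresh(boolean)
}

// Hypothetical callback that stages one item as transient (unsaved) changes.
interface NodeWriter {
    void write(String item) throws Exception;
}

final class AtomicImport {
    private AtomicImport() {
    }

    /** Stages every item, then persists them with a single save; returns false if anything failed. */
    static boolean importAll(PendingChanges session, NodeWriter writer, List<String> items) {
        try {
            for (String item : items) {
                writer.write(item); // nothing is persisted yet
            }
            session.save();         // one save: all nodes land together
            return true;
        } catch (Exception e) {
            try {
                session.refresh(false); // discard the half-built transient changes
            } catch (Exception ignored) {
                // nothing more can be done here; the session should be closed by the caller
            }
            return false;
        }
    }
}
```

One trade-off to keep in mind: a single save holds all changes in memory until the end, so for very large imports saving in bounded batches may still be the more practical choice.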

Thank you so much Bilal and theop76211228 for your answers and comments. Will give it a try...

@bilala23933647 

 

Hi Bilal, I am getting an error when obtaining a session handle to start the JCR operations after fetching data from the service.

I have a service user with CRUD permissions, which I will use for the JCR operations.

In my Scheduler class, inside the run method, I have this code:

 

String serviceUser = CommonUtil.getProperty(context, QNA_DATALOAD_CONFIG, SERVICE_USER);

--> the code above correctly returns the service user that I configured.

 

ResourceResolver resourceResolver = new ResourceResolverUtil().getResourceResolverViaAcl(resolverFactory,serviceUser);

--> the error occurs on the line above

 

// The getResourceResolverViaAcl method is defined in a class called ResourceResolverUtil.
public ResourceResolver getResourceResolverViaAcl(ResourceResolverFactory resolverFactory, String serviceUser) {
    try {
        Map<String, Object> param = new HashMap<String, Object>();
        param.put(ResourceResolverFactory.SUBSERVICE, serviceUser);
        // the error occurs on the line below
        ResourceResolver resourceResolver = resolverFactory.getServiceResourceResolver(param);
        return resourceResolver;
    }
    catch (Exception e) {
        throw new IllegalStateException(e.getMessage(), e);
    }
}

 

error.log shows the error below:

26.05.2020 00:00:00.012 *ERROR* [sling-default-1-Registered Service.22296] org.apache.sling.commons.scheduler.impl.QuartzScheduler Exception during job execution of job 'ca.manulifeglobal.core.schedulers.QnAScheduledTask@1cc2f04a' with name 'Registered Service.22296' : null
java.lang.IllegalStateException: null
at ca.manulifeglobal.core.util.ResourceResolverUtil.getResourceResolverViaAcl(ResourceResolverUtil.java:36) [ca.manulife.dxp.aem-global:0.4.66.SNAPSHOT]
at ca.manulifeglobal.core.services.impl.QnASchedulerServiceImpl.callQnAMaker(QnASchedulerServiceImpl.java:79) [ca.manulife.dxp.aem-global:0.4.66.SNAPSHOT]
at ca.manulifeglobal.core.schedulers.QnAScheduledTask.run(QnAScheduledTask.java:113) [ca.manulife.dxp.aem-global:0.4.66.SNAPSHOT]
at org.apache.sling.commons.scheduler.impl.QuartzJobExecutor.execute(QuartzJobExecutor.java:347) [org.apache.sling.commons.scheduler:2.7.2]
at org.quartz.core.JobRunShell.run(JobRunShell.java:202) [org.apache.sling.commons.scheduler:2.7.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException: null
at ca.manulifeglobal.core.util.ResourceResolverUtil.getResourceResolverViaAcl(ResourceResolverUtil.java:32) [ca.manulife.dxp.aem-global:0.4.66.SNAPSHOT]
... 7 common frames omitted
26.05.2020 00:00:00.014 *INFO* [sling-default-5-com.adobe.granite.threaddump.impl.BackupCleaner] com.adobe.granite.threaddump.impl.BackupCleaner All backup(s) successfully removed.
26.05.2020 00:00:00.014 *INFO* [sling-default-4-com.day.cq.dam.similaritysearch.internal.scheduler.PeriodicAutoTaggingJob.4560] com.day.cq.dam.similaritysearch.internal.scheduler.PeriodicAutoTaggingJob Smart Tags not configured. Ignoring periodic job.

 

Any help will be greatly appreciated.
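
A note on reading the trace above: the NullPointerException is raised inside getResourceResolverViaAcl itself (ResourceResolverUtil.java:32), and the only object that method dereferences in its try block is resolverFactory. That suggests the factory passed in was null, for example because it was never injected into the calling class, although the thread does not confirm this. The sketch below shows one way to fail fast with a clearer message. ResolverFactory and ResolverGuard are hypothetical stand-ins so the example compiles without Sling; in real code the parameter would be org.apache.sling.api.resource.ResourceResolverFactory.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for Sling's ResourceResolverFactory, reduced to the one method used in the thread.
interface ResolverFactory {
    Object getServiceResourceResolver(Map<String, Object> authInfo) throws Exception;
}

final class ResolverGuard {
    // Same string value as the ResourceResolverFactory.SUBSERVICE constant in Sling.
    static final String SUBSERVICE = "sling.service.subservice";

    private ResolverGuard() {
    }

    static Object open(ResolverFactory factory, String subService) throws Exception {
        // Fail with a descriptive message instead of a bare NullPointerException.
        Objects.requireNonNull(factory, "ResourceResolverFactory is null; was it injected?");
        Objects.requireNonNull(subService, "sub-service name is null");

        Map<String, Object> authInfo = new HashMap<>();
        authInfo.put(SUBSERVICE, subService);
        return factory.getServiceResourceResolver(authInfo);
    }
}
```

With a guard like this, the error.log entry would name the missing dependency directly rather than showing "IllegalStateException: null".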