Sling job is running after 2 to 3 minutes from job addition | Community
Level 3
January 23, 2019
Question

  • January 23, 2019
  • 4 replies
  • 3440 views

We have two to three JobConsumers on a publish instance (AEM 6.3 SP1). After a job is added to a particular topic (JobManager.addJob), the JobConsumer registered for that topic runs only after 2 to 3 minutes. The same issue exists for the other JobConsumers on the publish instance. The publish instance is clustered and uses MongoDB (leader and non-leader nodes).

We verified that jobs are distributed across both instances and that the startup delay is set to 30 in the Apache Sling Job Manager configuration.

Is there any other configuration that needs to be changed so that a job starts within 30 to 40 seconds?

Is there any way to debug this issue?

4 replies

Gaurav-Behl
Level 10
January 23, 2019

Are the jobs created using the default queue or a custom queue?

sankar_ramanp36
January 24, 2019

The jobs are created using a custom queue (/test/jobs/creation).

Gaurav-Behl
Level 10
January 24, 2019

Could you check whether the queue type or any other configuration is the reason for this behavior? Could you share more details about it?

You could also check the 'Apache Sling Job Queue Configuration' for your custom queue and validate the configured thread pool size, maximum parallel jobs, and priority.

Reference: Apache Sling :: Apache Sling Eventing and Job Handling

  • queue.type: The type of the queue: ORDERED, UNORDERED, TOPIC_ROUND_ROBIN.
  • queue.topics: A list of topics processed by this queue. Either the concrete topic is specified or the topic string ends with "/*" or "/.". If a star is at the end, all topics and sub topics match; with a dot, only direct sub topics match.
  • queue.maxparallel: How many jobs can be processed in parallel? -1 for number of processors.
  • queue.retries: How often the job should be retried in case of failure (i.e. the job did not finish with a succeeded or cancelled result). -1 for endless retries. In case of exceptions there is no retry.
  • queue.retrydelay: The waiting time in milliseconds between job retries.
  • queue.priority: The thread priority: NORM, MIN, or MAX.

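To make the queue.topics matching rule above concrete, here is a minimal sketch of how the "/*" and "/." suffixes could be interpreted. It only illustrates the rule as quoted from the documentation; it is not Sling's actual implementation, and the class name, method name, and example topics are hypothetical.

```java
// Illustrative only: mirrors the documented rule, not Sling's own code.
class TopicPatternSketch {

    /** Returns true if the topic is covered by the configured pattern. */
    static boolean matches(String pattern, String topic) {
        if (pattern.endsWith("/*")) {
            // Star: any topic below the prefix matches, at any depth.
            String prefix = pattern.substring(0, pattern.length() - 1); // keeps the trailing '/'
            return topic.length() > prefix.length() && topic.startsWith(prefix);
        }
        if (pattern.endsWith("/.")) {
            // Dot: only direct sub topics match, so no further '/' may follow the prefix.
            String prefix = pattern.substring(0, pattern.length() - 1);
            if (topic.length() <= prefix.length() || !topic.startsWith(prefix)) {
                return false;
            }
            return topic.indexOf('/', prefix.length()) == -1;
        }
        // No wildcard suffix: the concrete topic must match exactly.
        return pattern.equals(topic);
    }

    public static void main(String[] args) {
        // Hypothetical topic names used purely for illustration.
        System.out.println(matches("test/jobs/*", "test/jobs/creation/retry")); // true
        System.out.println(matches("test/jobs/.", "test/jobs/creation"));       // true
        System.out.println(matches("test/jobs/.", "test/jobs/creation/retry")); // false
    }
}
```

If the custom queue's queue.topics value does not cover the topic passed to addJob, the job would presumably not land in that queue at all, so this is worth ruling out before looking at timing.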
joerghoh
Adobe Employee
January 24, 2019

Sling Jobs in a cluster are transferred to the master through the repository. That means there is at least the delay of syncing repository data. You can easily test this: add a single node on one instance, then immediately check on another cluster instance to see when this node appears (use CRXDE). In a cluster this can take some seconds.
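The manual CRXDE check described above could also be timed with a small polling helper. This is only a sketch: the visibility check is passed in as a generic BooleanSupplier, and the class name, method name, and the stand-in check in main are hypothetical. In a real test the supplier would look up the newly added node on the second instance, which is not shown here.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical helper: measures how long it takes until a check first succeeds.
class SyncDelayProbe {

    /**
     * Polls the check until it returns true or the timeout expires.
     * Returns the elapsed time in milliseconds, or -1 on timeout or interruption.
     */
    static long measureDelayMillis(BooleanSupplier nodeVisible, Duration timeout, Duration interval) {
        long start = System.nanoTime();
        long deadline = start + timeout.toNanos();
        while (true) {
            if (nodeVisible.getAsBoolean()) {
                return (System.nanoTime() - start) / 1_000_000;
            }
            if (System.nanoTime() >= deadline) {
                return -1;
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return -1;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for "the node is visible on the other instance":
        // here it simply becomes true about 300 ms after start.
        long visibleAt = System.currentTimeMillis() + 300;
        long delay = measureDelayMillis(
                () -> System.currentTimeMillis() >= visibleAt,
                Duration.ofSeconds(5),
                Duration.ofMillis(50));
        System.out.println("Node became visible after ~" + delay + " ms");
    }
}
```

Running this a few times gives a rough baseline for the repository sync delay, which can then be compared with the 2 to 3 minutes observed for the jobs.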

On top of that, the master node's job manager needs to pick up the job data. I'm not sure how this is done (some time ago there were regular queries, but that approach was dropped in favor of a more stable one), and I'm not sure how to test it.

Jörg

Adobe Employee
June 25, 2024

@joerghoh My understanding is that an UNORDERED queue can run jobs on any node in the cluster, not just the leader node. We have observed this ourselves while running some tests.

On this point, I would like to confirm: if we have 2 nodes in the cluster, each with 4 cores, and we set queue.maxparallel="{Double}-1", can we then run 2 * 4 = 8 jobs in parallel? Is this inference valid?
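The arithmetic behind this question can be written out as a small sketch. It assumes that the queue.maxparallel limit is applied independently on each node and that jobs are actually distributed to both nodes; the thread does not confirm either assumption. The class and method names are hypothetical.

```java
// Sketch of the arithmetic only; assumes the limit applies per node (unconfirmed).
class ParallelismSketch {

    /** Resolves the per-node limit: -1 means "number of processors", per the quoted docs. */
    static int effectivePerNode(double maxParallel, int cores) {
        if (maxParallel == -1) {
            return cores;
        }
        return (int) maxParallel;
    }

    /** Upper bound across the cluster if every node enforces its own limit. */
    static int clusterWide(double maxParallel, int coresPerNode, int nodes) {
        return effectivePerNode(maxParallel, coresPerNode) * nodes;
    }

    public static void main(String[] args) {
        // 2 nodes with 4 cores each and queue.maxparallel = -1
        System.out.println(clusterWide(-1, 4, 2)); // prints 8
    }
}
```

Under those assumptions the result is 8, but this is an upper bound: it is only reached if enough jobs are queued and they are spread evenly across the nodes.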