Sling job runs 2 to 3 minutes after it is added

January 23, 2019
4 replies
3440 views

We have two or three JobConsumer services on a publish instance (AEM 6.3 SP1). After a job is added to a particular topic (JobManager.addJob), the job consumer for that topic runs only after 2 to 3 minutes. The same issue exists for the other job consumers on the publish instance. The publish instance is clustered and uses MongoDB (leader and non-leader).

We verified that jobs are distributed across both instances and that the startup delay is set to 30 in the Apache Sling Job Manager configuration.

Are there other configurations that need to be changed so that the job runs within 30 to 40 seconds at most?

Is there any way to debug this issue?


Gaurav-Behl

Level 10

Are the jobs created using the default queue or a custom queue?

sankar_ramanp36

The jobs are created using a custom queue (/test/jobs/creation).

Gaurav-Behl

Level 10

Could you check whether the queue type or some other configuration is causing this behavior? Could you share more details about it?

You could also review the 'Apache Sling Job Queue Configuration' for your custom queue and check the configured thread pool size, maximum parallel jobs, and priority.

Reference: Apache Sling :: Apache Sling Eventing and Job Handling

| Property | Description |
|---|---|
| `queue.type` | The type of the queue: ORDERED, UNORDERED, TOPIC_ROUND_ROBIN |
| `queue.topics` | A list of topics processed by this queue. Either the concrete topic is specified or the topic string ends with `/*` or `/.`. If a star is at the end, all topics and subtopics match; with a dot, only direct subtopics match. |
| `queue.maxparallel` | How many jobs can be processed in parallel? -1 for number of processors. |
| `queue.retries` | How often the job should be retried in case of failure (i.e. the job did not finish with a succeeded or cancelled result). -1 for endless retries. In case of exceptions there is no retry. |
| `queue.retrydelay` | The waiting time in milliseconds between job retries. |
| `queue.priority` | The thread priority: NORM, MIN, or MAX |
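
To make the `queue.topics` matching rules concrete, here is a minimal Java sketch of how I read that description. It is not Sling's actual implementation; the class name `TopicMatch`, the method `matches`, and the topic strings are hypothetical and only illustrate the three cases (concrete topic, trailing `/*`, trailing `/.`).

```java
// Illustrative only: a reading of the queue.topics description, not Sling code.
class TopicMatch {

    // Returns true if the job topic would be handled by a queue configured with this pattern.
    static boolean matches(String pattern, String topic) {
        if (pattern.endsWith("/*")) {
            // Star: all topics and subtopics below the prefix match.
            String prefix = pattern.substring(0, pattern.length() - 1); // keeps the trailing '/'
            return topic.startsWith(prefix) && topic.length() > prefix.length();
        }
        if (pattern.endsWith("/.")) {
            // Dot: only direct subtopics match, so no further '/' may follow the prefix.
            String prefix = pattern.substring(0, pattern.length() - 1); // keeps the trailing '/'
            return topic.startsWith(prefix)
                    && topic.length() > prefix.length()
                    && topic.indexOf('/', prefix.length()) < 0;
        }
        // Otherwise the pattern is a concrete topic and must match exactly.
        return pattern.equals(topic);
    }

    public static void main(String[] args) {
        System.out.println(matches("test/jobs/*", "test/jobs/creation/retry"));
        System.out.println(matches("test/jobs/.", "test/jobs/creation/retry"));
        System.out.println(matches("test/jobs/creation", "test/jobs/creation"));
    }
}
```

If a custom queue's topic pattern does not match the topic passed to addJob, the job would presumably fall back to another queue, so this is worth checking before tuning anything else.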

joerghoh

Adobe Employee

Sling jobs in a cluster are transferred to the master through the repository. That means there is at least the delay of syncing repository data. You can easily test this by adding a single node on one instance and then immediately checking on another cluster instance when this node appears (use CRXDE). In a cluster this can take some seconds.

On top of that, the master node's job manager needs to pick up the job data. Not sure how this is done (some time ago there were regular queries, but that has been dropped in favor of a more stable approach). Not sure how to test this though ...
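
One rough way to narrow this down might be to log, inside the consumer, how long each job waited between creation and pickup. The sketch below keeps the arithmetic in plain Java. It assumes the creation timestamp is taken from `Job.getCreated()` inside `JobConsumer.process` and that the clocks of the cluster instances are roughly in sync; the class and method names (`JobPickupLatency`, `pickupDelayMillis`, `exceeds`) are made up for illustration.

```java
import java.util.Calendar;
import java.util.concurrent.TimeUnit;

// Illustrative helper: measures how long a job waited before a consumer started it.
class JobPickupLatency {

    // Milliseconds between job creation and the given "now" timestamp.
    static long pickupDelayMillis(Calendar created, long nowMillis) {
        return nowMillis - created.getTimeInMillis();
    }

    // True if the job waited longer than the given threshold in seconds.
    static boolean exceeds(Calendar created, long nowMillis, long thresholdSeconds) {
        return pickupDelayMillis(created, nowMillis) > TimeUnit.SECONDS.toMillis(thresholdSeconds);
    }

    public static void main(String[] args) {
        // Simulated job that was created 150 seconds ago.
        // In a real consumer this value would come from job.getCreated() (assumption).
        long now = System.currentTimeMillis();
        Calendar created = Calendar.getInstance();
        created.setTimeInMillis(now - 150_000L);

        long delay = pickupDelayMillis(created, now);
        System.out.println("Job picked up after " + (delay / 1000) + " s");
        if (exceeds(created, now, 40)) {
            System.out.println("Pickup delay is above the 40 s target");
        }
    }
}
```

Comparing this delay with the time a plain node takes to appear on the other instance should indicate whether the wait comes from repository sync or from the job manager itself.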

Jörg


pulgupta1

Adobe Employee

@joerghoh My understanding is that an UNORDERED queue can run jobs on any node in the cluster, not just the leader node. We have even observed this while running some tests.

On this point I would like to confirm one thing: if we have 2 nodes in the cluster, each with 4 cores, and we have set queue.maxparallel="{Double}-1", then I believe we can run 2*4 = 8 jobs in parallel. Is this inference valid?
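
The arithmetic behind that question can be written out as a small Java sketch. It rests on two unconfirmed assumptions: that `queue.maxparallel` is applied per instance, and that jobs of the queue really are processed on every instance. The class `QueueParallelism` and its methods are hypothetical names, not part of Sling.

```java
// Illustrative arithmetic only; whether Sling behaves this way is the open question above.
class QueueParallelism {

    // Per-instance limit: -1 means "number of processors" according to the queue.maxparallel description.
    static int effectivePerInstance(int maxParallel, int cores) {
        return maxParallel == -1 ? cores : maxParallel;
    }

    // Cluster-wide limit, ASSUMING the limit applies to each instance independently.
    static int clusterWide(int maxParallel, int[] coresPerInstance) {
        int total = 0;
        for (int cores : coresPerInstance) {
            total += effectivePerInstance(maxParallel, cores);
        }
        return total;
    }

    public static void main(String[] args) {
        int[] cluster = {4, 4}; // two nodes with 4 cores each, as in the question
        System.out.println(clusterWide(-1, cluster)); // prints 8
    }
}
```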
