Sling job runs 2 to 3 minutes after it is added

Level 4

We have two or three JobConsumers on a publish instance (AEM 6.3 SP1). After a job is added to a particular topic (JobManager.addJob), the JobConsumer for that topic runs only after 2 to 3 minutes. The same issue occurs with the other JobConsumers on the publish instance. The publish instance is clustered and uses MongoDB (leader and non-leader).

I verified that jobs are distributed across both instances and that the startup delay is set to 30 in the Apache Sling Job Manager configuration.

Are there other configuration settings that need to be changed so that the job runs within 30 to 40 seconds at most?

Is there any way to debug this issue?

4 Replies

Level 10

Are the jobs created using the default queue or a custom queue?

Level 1

The jobs are created using a custom queue (/test/jobs/creation).

Level 10

Could you check whether the queue type or some other configuration is the reason for this behavior? Could you share more details about it?

You could also check the 'Apache Sling Job Queue Configuration' for your custom queue and verify the configured thread pool size, maximum parallel jobs, and priority.

Reference: Apache Sling :: Apache Sling Eventing and Job Handling

queue.type: The type of the queue: ORDERED, UNORDERED, TOPIC_ROUND_ROBIN.
queue.topics: A list of topics processed by this queue. Either the concrete topic is specified or the topic string ends with "/*" or "/.". If a star is at the end, all topics and sub topics match; with a dot, only direct sub topics match.
queue.maxparallel: How many jobs can be processed in parallel? -1 for number of processors.
queue.retries: How often the job should be retried in case of failure (i.e. the job did not finish with a succeeded or cancelled result). -1 for endless retries. In case of exceptions there is no retry.
queue.retrydelay: The waiting time in milliseconds between job retries.
queue.priority: The thread priority: NORM, MIN, or MAX.
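
To check whether a custom queue's queue.topics entry actually covers the topic passed to addJob, the matching rule quoted above can be sketched in a few lines of Java. This is only an illustration of the rule as the documentation describes it, not Sling's own implementation; the class name QueueTopicMatcher and the sample topics are hypothetical.

```java
/**
 * Illustrative matcher for the queue.topics rule described above.
 * Hypothetical helper, not part of the Sling API.
 */
final class QueueTopicMatcher {

    private QueueTopicMatcher() {
    }

    /** Returns true if the job topic is covered by the configured pattern. */
    static boolean matches(String pattern, String topic) {
        if (pattern.endsWith("/*")) {
            // Star: all topics and sub topics below the prefix match.
            String prefix = pattern.substring(0, pattern.length() - 1);
            return topic.length() > prefix.length() && topic.startsWith(prefix);
        }
        if (pattern.endsWith("/.")) {
            // Dot: only direct sub topics (no further '/') match.
            String prefix = pattern.substring(0, pattern.length() - 1);
            if (topic.length() <= prefix.length() || !topic.startsWith(prefix)) {
                return false;
            }
            return topic.indexOf('/', prefix.length()) < 0;
        }
        // Otherwise the pattern is a concrete topic and must match exactly.
        return pattern.equals(topic);
    }

    public static void main(String[] args) {
        System.out.println(matches("/test/jobs/*", "/test/jobs/creation"));     // true
        System.out.println(matches("/test/jobs/.", "/test/jobs/creation/sub")); // false
        System.out.println(matches("/test/jobs/creation", "/test/jobs/other")); // false
    }
}
```

If the topic used in addJob does not match any pattern of the custom queue, the job would not be handled by that queue's settings, so this is worth ruling out before tuning the other parameters.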

Employee Advisor

Sling jobs in a cluster are transferred to the master through the repository. That means there is at least the delay of syncing repository data. You can easily test this: add a single node on one instance and then immediately check on another cluster instance to see when the node appears (use CRXDE). In a cluster this can take a few seconds.

On top of that, the master node's job manager needs to pick up the job data. I'm not sure how this is done (some time ago there were regular queries, but that has been dropped in favor of a more stable approach), and I'm not sure how to test it.

Jörg