We have two to three job consumers on a publish instance (AEM 6.3 SP1). After a job is added to a particular topic (jobManager.addJob), the job consumer for that topic only runs after 2 to 3 minutes. The same issue exists for the other job consumers on the publish instance. The publish instance is clustered and uses MongoDB (leader and non-leader).
We verified that jobs are distributed across both instances and that the startup delay is set to 30 in the Apache Sling Job Manager configuration.
Is there any other configuration that needs to be changed so that the job runs within 30 to 40 seconds?
Is there any way to debug this issue?
Are the jobs created using the default queue or a custom queue?
The jobs are created using a custom queue (/test/jobs/creation).
Could you check whether the queue type or some other configuration is the reason for this behavior? Could you share more details about it?
You could also review the 'Apache Sling Job Queue Configuration' for your custom queue and check the configured thread pool size, maximum parallel jobs, and priority.
Reference: Apache Sling :: Apache Sling Eventing and Job Handling
| Property | Description |
| --- | --- |
| queue.type | The type of the queue: ORDERED, UNORDERED, TOPIC_ROUND_ROBIN |
| queue.topics | A list of topics processed by this queue. Either the concrete topic is specified, or the topic string ends with `/*` or `/.`. If a star is at the end, all topics and sub topics match; with a dot, only direct sub topics match. |
| queue.maxparallel | How many jobs can be processed in parallel? -1 for number of processors. |
| queue.retries | How often the job should be retried in case of failure (i.e. the job did not finish with a succeeded or cancelled result). -1 for endless retries. In case of exceptions there is no retry. |
| queue.retrydelay | The waiting time in milliseconds between job retries. |
| queue.priority | The thread priority: NORM, MIN, or MAX |
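To check whether a job topic actually falls into the custom queue, the queue.topics matching rules from the table can be sketched in plain Java. This is a minimal illustration of the documented rules, not Sling's own implementation; the class name `TopicPatternSketch`, the method `matches`, and the sample topics are hypothetical.

```java
final class TopicPatternSketch {

    /**
     * Returns true if the topic is covered by one queue.topics entry.
     * Assumption: only the three documented forms are handled
     * (exact topic, trailing "/*", trailing "/.").
     */
    static boolean matches(String pattern, String topic) {
        if (pattern.endsWith("/*")) {
            // "test/jobs/*" -> prefix "test/jobs/": any depth below the prefix matches
            String prefix = pattern.substring(0, pattern.length() - 1);
            return topic.startsWith(prefix);
        }
        if (pattern.endsWith("/.")) {
            // "test/jobs/." -> prefix "test/jobs/": only direct sub topics match,
            // so the parent path of the topic must be exactly the prefix
            String prefix = pattern.substring(0, pattern.length() - 1);
            int lastSlash = topic.lastIndexOf('/');
            return lastSlash > -1 && topic.substring(0, lastSlash + 1).equals(prefix);
        }
        // No wildcard: the topic must be identical to the configured entry
        return pattern.equals(topic);
    }

    public static void main(String[] args) {
        System.out.println(matches("test/jobs/*", "test/jobs/creation/retry"));
        System.out.println(matches("test/jobs/.", "test/jobs/creation/retry"));
        System.out.println(matches("test/jobs/creation", "test/jobs/creation"));
    }
}
```

If a topic matches none of the entries of the custom queue, the job would be handled by another queue, so this is worth ruling out before tuning thread or parallelism settings.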
Sling jobs in a cluster are transferred to the master through the repository. That means there is at least the delay of syncing repository data. You can easily test this by adding a single node on one instance and then immediately checking on another cluster instance when this node appears (use CRXDE). In a cluster this can take some seconds.
On top of that, the master node's job manager needs to pick up the job data. I am not sure how this is done (some time ago there were regular queries, but that has been dropped in favor of a more stable approach), and I am not sure how to test it.
Jörg
@Jörg_Hoh My understanding is that an UNORDERED queue can run jobs on any node in the cluster, not just on the leader node. We have even observed this while running some tests.
On this point, I want to confirm: if we have 2 nodes in the cluster, each with 4 cores, and we have set queue.maxparallel="{Double}-1", then I believe we can run 2*4 = 8 jobs in parallel. Is this inference valid?
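The arithmetic behind that inference can be written out as a small Java sketch. It rests on two assumptions that the thread does not confirm: that the queue.maxparallel limit is applied per instance rather than cluster-wide, and that jobs for the queue are distributed to both nodes. The class and method names are hypothetical, and only the documented "-1 means number of processors" rule is modeled.

```java
final class QueueParallelismSketch {

    /**
     * Per-instance limit. Assumption: -1 resolves to the core count of the
     * instance, any other value is used as configured.
     */
    static int effectiveMaxParallel(double configured, int cores) {
        if (configured == -1) {
            return cores;
        }
        return (int) configured;
    }

    /**
     * Upper bound for the whole cluster. Assumption: every node applies the
     * limit independently and all nodes have the same number of cores.
     */
    static int clusterCapacity(int nodes, int coresPerNode, double configured) {
        return nodes * effectiveMaxParallel(configured, coresPerNode);
    }

    public static void main(String[] args) {
        // 2 nodes with 4 cores each and queue.maxparallel = -1
        System.out.println(clusterCapacity(2, 4, -1)); // prints 8
    }
}
```

Under those assumptions the result is an upper bound of 8; the number of jobs actually running in parallel also depends on how the jobs end up being distributed between the nodes.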