Author cluster setup - how to set up slave as read-only

Level 2

We are currently running our 5.6.1 author cluster as an active-passive "shared nothing" cluster, meaning:

  • a load balancer sends all traffic to the master instance by default
  • the slave does not get any traffic; it is there for failover purposes only
  • each instance has its own local copy of the repository and datastore (shared nothing), and everything has to be synced

So far so good.

Now, what I want to achieve is that in normal operation the slave only listens and syncs the content (on port 8080 by default) from the master.

It should only sync the repo but never do any other activities, in particular:

  • it should not run workflows
  • it should not replicate content to the publishers (redundant)
  • it should not execute Blueprint LiveCopy actions
  • it should not trigger any timed events
  • etc.

All of this should happen only on the master; the slave (as long as it considers itself a slave) should stay on "warm standby" and only sync the repo and datastore.

It seems, though, that CQ does not do this out of the box, and I have not found anything in the documentation about how to achieve this.

Is it something that developers have to take into account in their code?

Is it something that can be achieved on a global level, by switching off or configuring some OSGi services?

4 Replies

Employee Advisor

Hi,

The behaviour you want is implemented at least partially:

  • Functions based on Sling Jobs (workflows, replication) do not run on the slave node.
  • Other services use the ClusterAware interface to ensure that their function is only executed on the master.

By default, MSM is not bound to either of these two mechanisms, nor is the Sling scheduler.
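For custom code that is covered by neither mechanism (for example a scheduled task), the ClusterAware approach could look roughly like the sketch below. It is only an illustration: the interface is declared locally as a stand-in modelled on com.day.cq.jcrclustersupport.ClusterAware so that the example compiles without the AEM libraries, and the class name MasterOnlyJob and its counter are hypothetical. Please verify the interface against the API shipped with your version before relying on it.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Local stand-in modelled on com.day.cq.jcrclustersupport.ClusterAware (assumption:
// declared here only so the sketch compiles without the AEM libraries).
interface ClusterAware {
    void bindRepository(String repositoryId, String clusterId, boolean isMaster);
    void unbindRepository();
}

// Hypothetical task that a scheduler would invoke periodically on every cluster node.
class MasterOnlyJob implements ClusterAware, Runnable {

    // Defaults to false, so nothing runs until the node has been told it is the master.
    private final AtomicBoolean master = new AtomicBoolean(false);

    // Counts how often the real work was executed (for illustration only).
    private final AtomicInteger executions = new AtomicInteger();

    @Override
    public void bindRepository(String repositoryId, String clusterId, boolean isMaster) {
        // Called when the cluster state is (re)announced, e.g. after a failover.
        master.set(isMaster);
    }

    @Override
    public void unbindRepository() {
        // Without a repository binding, behave like a slave.
        master.set(false);
    }

    @Override
    public void run() {
        if (!master.get()) {
            return; // slave or unknown state: skip the work
        }
        executions.incrementAndGet(); // placeholder for the real work
    }

    int executions() {
        return executions.get();
    }
}
```

In a real bundle the class would additionally be registered as an OSGi service under the ClusterAware interface, so that the platform can call bindRepository on it; that registration is omitted here.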

kind regards,
Jörg

Thanks for that, but then we're doing something wrong I guess.

Because I am seeing some things happen on our slave, like replication, when they should not. E.g. I am seeing stuff like this in the log:

11.02.2015 13:19:41.080 *INFO* [pool-6-thread-27-com_day_cq_replication_job_replicate_10_233_7_5(com/day/cq/replication/job/replicate_10_233_7_5)] com.day.cq.replication.Agent.replicate_10_233_7_5 Replication (ACTIVATE) of /content/example-com/en/press-releases/2015/2/11/hooray-we-did-it successful.

 

So if I see this in the log, I assume the slave is replicating to the publisher directly. In our case we call replication from a custom workflow.

But you say neither should be running on the slave, so how can that be?

Is there any specific documentation for developers on how to code this properly and what to watch out for?

Level 10

In 5.6 and above, the cluster leader determined by discovery runs the workflow jobs by default. When you set up the cluster, make sure the master is the instance with the lowest slingId. I do not recall the hotfix number for this. Reach out to the support channel; they should be able to help with configuring load distribution in the cluster.
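To make the "lowest slingId" rule concrete, here is a minimal sketch. It assumes, as described above, that the leader is simply the instance whose slingId sorts lowest, and it further assumes plain string ordering; the class name, method names and IDs are made up for illustration and are not part of any Sling API.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper illustrating "the instance with the lowest slingId leads".
class LowestSlingIdLeader {

    // Returns the slingId that sorts lowest (assumption: plain string ordering).
    static String electLeader(List<String> slingIds) {
        if (slingIds.isEmpty()) {
            throw new IllegalArgumentException("cluster has no instances");
        }
        return Collections.min(slingIds);
    }

    // True if the local instance would be the leader under that rule.
    static boolean isLeader(String localSlingId, List<String> slingIds) {
        return localSlingId.equals(electLeader(slingIds));
    }
}
```

Under this assumption, comparing the slingId values of both author instances shows which one will run the workflow jobs.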