CQ5.6 Disaster Recovery Strategies
Hi All.
I'm trying to develop a disaster recovery strategy for CQ5.6 but it seems like all solutions are ultimately deficient in one way or another. The environment has 1 author and 4 publishers.
These appear to be some of the options:
1) Clustering: This gets me an exact replica of my environment but clustering requires very good network throughput between instances. If my primary and DR instances are on opposite sides of the United States then I can't get the required throughput. And does clustering guarantee that the DR site author and publishers will be in sync? If an activation completes at the primary site just before a major failure, is that activation guaranteed to be properly reflected on both the author and publishers at the DR site? I believe this solution also requires additional CQ5 licenses.
2) Author-to-Author Replication: This solution copes better with a slow network, but the entire repository on the author needs to be replicated. Is there some way to configure author-to-author replication so that absolutely everything is replicated? And if so, does replication to the DR author instance trigger a replication to the DR publishers? If not, then each primary publisher must replicate to its DR partner, and that doesn't guarantee that the DR author will be in sync with its DR publishers. This solution also requires additional CQ5 licenses.
3) Storage Replication (e.g. via Amazon EBS Volumes): This is more of an offline approach where, in case of a disaster, I spin up new instances at the DR site using the most recent backups. This solution doesn't require additional CQ5 licenses. However, it doesn't guarantee that the DR author will be in sync with its DR publishers. If I want to guarantee synchronization between the DR instances, then I have to shut down the primary author (and maybe the publishers as well) before I create backups for each. The downtime could be minimized by replicating the data store separately, but I still have to shut down the primary author.
4) Database Persistence Manager: This would offload all backups/clustering to the database. It should then be possible to achieve full synchronization at the DR site without having to stop either the primary author or the database. The drawback, however, is that you need to maintain a database and spend money on additional infrastructure.
So, assuming you stay with the tar persistence manager, are there ways to maintain a DR site and guarantee that the DR author is perfectly in sync with all its DR publishers without having to shut down the author?
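To make concrete what I mean by "in sync", here is a minimal sketch in Python of the problem with backups taken at different times. It is a simplified model, not anything CQ5 provides: the `Activation` record, the `out_of_sync` function, the instance names, the paths and the timestamps are all hypothetical. It assumes a backup taken at time t reflects exactly those activations that completed at or before t.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activation:
    path: str          # hypothetical content path
    completed_at: int  # hypothetical timestamp, e.g. seconds since some epoch


def out_of_sync(activations, author_backup_at, publisher_backups_at):
    """For each publisher, list the paths whose activation is reflected in
    only one of the two backups (author vs. that publisher).

    Assumption: a backup taken at time t contains every activation that
    completed at or before t, and nothing later.
    """
    result = {}
    for name, taken_at in publisher_backups_at.items():
        lo, hi = sorted((author_backup_at, taken_at))
        # Activations that completed between the two backup times are
        # present in the later backup but missing from the earlier one.
        result[name] = [a.path for a in activations if lo < a.completed_at <= hi]
    return result


activations = [
    Activation("/content/a", 100),
    Activation("/content/b", 200),
    Activation("/content/c", 300),
]

drift = out_of_sync(
    activations,
    author_backup_at=250,
    publisher_backups_at={"pub1": 250, "pub2": 150, "pub3": 350},
)
print(drift)
```

In this model only a publisher backed up at exactly the same logical point as the author comes back clean. The one backed up earlier is missing content the DR author believes is activated, and the one backed up later has content the DR author doesn't know was activated. That is the gap I'm trying to close without stopping the author.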
Thanks in advance for any suggestions.
David Frenkiel