We have already discussed vSphere Replication, how to install it and how to configure it; now it is finally time to replicate VMs.
Replication can be set on individual VMs through a step-by-step guided configuration in which you will be prompted for the destination datastore, the RPO, the replication server to use, whether or not to keep PIT copies, etc.
As discussed in vSphere Replication Part 2 - Installation, my environment comprises two sites (local SiteA and remote SiteB) managed by two different vCenters.
vSphere Replication can only be managed from the vSphere Web Client, so log in to it and click on vSphere Replication.
Select the Home tab, then your local (SiteA) vCenter Server, and click Manage.
To verify that everything is working properly, we need to check that our local Replication Server is connected. Click on vSphere Replication -> Replication Servers.
The vSphere Replication Appliance will be listed, since for now it is the only vSphere Replication Server deployed in our environment.
Let's add a target site by establishing a connection to the remote SiteB vCenter. Click Target Sites.
Enter the remote vCenter Server FQDN or IP address and credentials. Since this is a test deployment and I'm the only user administering the whole infrastructure, I will use the Administrator role to connect to the remote vCenter. A best practice is to restrict user permissions regardless of how many users your infrastructure has. For further information, have a look at the previous article: vSphere Replication Part 3 - Roles & Permissions.
The target site (SiteB) will be listed as connected.
We have completed the initial replication setup: we have linked our local vSphere Replication Server to the remote vSphere Replication Server, using the vCenters as replication proxies.
Let's enable replication on individual VMs.
Select the VM(s) you need to replicate, right-click, then select All vSphere Replication Actions -> Configure Replication.
Select the site whose vSphere Replication Server will be used for replication. Here you can choose either the local or the remote vSphere Replication Server. I will be using the remote SiteB vSphere Replication Server.
Since every location could have more than one vSphere Replication Server, you can either use Auto-assign vSphere Replication Server or select one manually. By default, Auto-assign picks the least busy one (i.e. the one handling the fewest replications).
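To make the "least busy" idea concrete, here is a minimal Python sketch of that selection rule. It is only an illustration of the behaviour described above, not VMware's actual implementation; the ReplicationServer class, the server names and the replication counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReplicationServer:
    name: str          # hypothetical label, not a real inventory name
    replications: int  # replications currently handled by this server


def auto_assign(servers):
    """Pick the server handling the fewest replications (assumed rule)."""
    if not servers:
        raise ValueError("no replication servers available")
    return min(servers, key=lambda s: s.replications)


# Hypothetical SiteB servers with made-up load figures.
site_b = [
    ReplicationServer("vr-siteb-01", 12),
    ReplicationServer("vr-siteb-02", 4),
]
print(auto_assign(site_b).name)  # vr-siteb-02
```

If you prefer to control placement yourself, manual selection simply replaces the min() step with your own choice.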
Select the Target Datastore; this will be the replication destination datastore. If you selected a remote vSphere Replication Server, the remote datastores managed by the remote vCenter will be available.
Next, select the quiescing method for the VM. VSS is only available if you are replicating a Windows virtual machine. Quiescing the VM with VSS when creating the PIT copy to be sent to the target datastore provides consistency not only for the VM itself but also at the application level. If VSS quiescing completes correctly, data integrity is retained in applications such as Exchange, Active Directory, etc.
The next selection is the RPO. The Recovery Point Objective (RPO) metric indicates how much data you are willing to lose in case of a disaster. vSphere Replication allows RPOs from a minimum of 15 minutes up to a maximum of 24 hours. As explained in the introduction post, more aggressive RPOs cannot be achieved because quiescing a VM too frequently could degrade the performance of the VM itself.
RPO selection is a key setting during replication and must be planned carefully. Low RPOs cannot always be achieved due to technical constraints. A key element to consider is the bandwidth available between the source site and the target site, since it affects the time needed to copy changes from SiteA to SiteB and therefore the RPOs you can meet.
Let's look at this in more detail with a practical example. Consider a VM that has the most aggressive RPO allowed by vSphere Replication (15 minutes). A common misconception is that an RPO of 15 minutes means that every 15 minutes the virtual machine at the source site is replicated to the destination site. This is incorrect: an RPO of 15 minutes means that the VM state at the remote site can be at most 15 minutes older than the virtual machine state at the source site. The vSphere Replication documentation has a nice example of this concept, so let's quote it:
You set the RPO during replication configuration to 15 minutes. If the replication starts at 12:00 and it takes five minutes to transfer to the target site, the instance becomes available on the target site at 12:05, but it reflects the state of the virtual machine at 12:00. The next replication can start no later than 12:10. This replication instance is then available at 12:15 when the first replication instance that started at 12:00 expires.
If you set the RPO to 15 minutes and the replication takes 7.5 minutes to transfer an instance, vSphere Replication transfers an instance all the time. If the replication takes more than 7.5 minutes, the replication encounters periodic RPO violations. For example, if the replication starts at 12:00 and takes 10 minutes to transfer an instance, the replication finishes at 12:10. You can start another replication immediately, but it finishes at 12:20. During the time interval 12:15-12:20, an RPO violation occurs because the latest available instance started at 12:00 and is too old.
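The timing in the quoted example can be reproduced with a small Python sketch. It assumes a simplified model that is not taken from the product itself: every transfer takes the same number of minutes, and in the back-to-back case each replication starts the moment the previous one finishes. The function names are hypothetical, and times are expressed in minutes after 12:00.

```python
def latest_next_start(rpo_min, transfer_min, started_at=0):
    """Latest minute the next replication can start so that it lands
    on the target before the current instance exceeds the RPO."""
    return started_at + rpo_min - transfer_min


def rpo_violation_windows(rpo_min, transfer_min, cycles=3):
    """Violation windows (start, end) for back-to-back replications.

    Instance k captures the VM state at k * transfer_min; the instance
    that replaces it becomes available at (k + 2) * transfer_min.
    """
    windows = []
    for k in range(cycles):
        snapshot_time = k * transfer_min
        expires = snapshot_time + rpo_min
        next_available = (k + 2) * transfer_min
        if next_available > expires:
            windows.append((expires, next_available))
    return windows


print(latest_next_start(15, 5))                   # 10, i.e. 12:10
print(rpo_violation_windows(15, 7.5))             # []
print(rpo_violation_windows(15, 10, cycles=2))    # [(15, 20), (25, 30)]
```

The first window, (15, 20), is the 12:15-12:20 violation from the quote. Under this model the latest instance is never older than the RPO as long as the transfer time stays at or below half the RPO.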
Another element to consider when planning the RPO is the average amount of data modified in the VM. Since vSphere Replication transfers deltas only, you should consider how the delta size varies between two replications. Can the available bandwidth transfer this delta to the destination site while meeting the selected RPO? Is the delta fairly constant in size, or do VM workload patterns cause spikes in the data committed to the VM disks, making the deltas grow larger and unpredictable and potentially preventing the connection from transferring the entire delta in time?
In conclusion, when choosing the right RPO for your VMs you must do some math based on the bandwidth available between the source and target sites as well as the growth in delta size caused by virtual machine workload patterns.
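As a rough starting point for that math, the Python sketch below estimates how long a delta takes to cross the link and compares it with half the RPO, the threshold suggested by the documentation example quoted earlier. It rests on several assumptions: the whole link is available to replication, there is no compression or protocol overhead, and sizes use decimal units (1 GB = 10^9 bytes). The function names and the sample figures are hypothetical.

```python
def transfer_minutes(delta_gb, bandwidth_mbps):
    """Minutes needed to send delta_gb gigabytes over a link of
    bandwidth_mbps megabits per second (ideal, overhead-free link)."""
    bits = delta_gb * 8 * 1000**3
    seconds = bits / (bandwidth_mbps * 1000**2)
    return seconds / 60


def meets_rpo(delta_gb, bandwidth_mbps, rpo_min):
    """True if the delta transfers within half the RPO, so that
    back-to-back replications never let the replica age past the RPO."""
    return transfer_minutes(delta_gb, bandwidth_mbps) <= rpo_min / 2


# Made-up figures: a 100 Mbit/s link and a 15-minute RPO.
print(transfer_minutes(3, 100))   # 4.0 minutes for a 3 GB delta
print(meets_rpo(3, 100, 15))      # True  (4 min <= 7.5 min)
print(meets_rpo(6, 100, 15))      # False (8 min >  7.5 min)
```

Run the check against your peak delta sizes, not just the average, since a workload spike is exactly what pushes a replication past the RPO.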
Another feature of vSphere Replication is Point In Time (PIT) instances. These are used to "freeze" the state of a virtual machine at a given moment. By enabling PIT copies you can select how many instances to keep and for how many days.
Once the RPO is selected we are ready to complete the wizard; click Finish.
Click on the replicated VM; under the Summary tab a new dashboard called VM Replication will appear.
To monitor the replications currently configured for a vCenter, both incoming and outgoing, go to vCenter -> Monitor -> vSphere Replication.
Other blog posts in vSphere Replication Series:
vSphere Replication Part 1 - Introduction
vSphere Replication Part 2 - Installation
vSphere Replication Part 3 - Roles & Permissions
vSphere Replication Part 4 - Configuration
vSphere Replication Part 5 - Enable Replication
vSphere Replication Part 6 - Perform Recovery
vSphere Replication Part 7 - Provision additional Replication Servers