Many types and levels of clustering technology can help keep applications highly available. The modern era of virtualized infrastructure has brought about mechanisms that were simply not possible in decades past.
Microsoft's Hyper-V is a modern, powerful, enterprise-ready hypervisor with many built-in capabilities that allow applications to run in a highly available fashion. However, there is a type of clustering performed at the guest virtual machine level that provides a degree of application availability the hypervisor cannot achieve on its own. This is called guest clustering.
In this two-part post, we will look at what guest clustering is, why it is needed, and how it is configured to ensure high availability of applications.
What is Hyper-V Guest Clustering?
If virtual machines are running on a set of Hyper-V hypervisor hosts that are clustered in a Windows Server cluster, why is virtual machine guest clustering needed?
The cluster of Hyper-V hosts ensures the high availability of guest virtual machines. With the Hyper-V role hosted on top of the Windows Failover Clustering functionality of Windows Server, Hyper-V can ensure the virtual machines are resilient in the event of a Hyper-V host failure. This requires shared storage between the hosts. If one of the Hyper-V hosts fails, the virtual machines on that host are restarted on healthy hosts remaining in the Hyper-V cluster. This ensures the virtual machines are resilient to a single host failure.
However, the applications those virtual machines serve are still affected by the restart. For as long as the guest operating system takes to boot on the healthy host, the application it was serving is unavailable. The result is an interruption in application availability, even though the Hyper-V cluster is providing high availability at the virtual machine level.
Guest Clustering is basically the operation of running a “cluster within a cluster”.
In other words, you build a “nested” Windows Server Failover Cluster inside virtual machines. When the Windows Failover Cluster running on virtual machines is affected by a physical Hyper-V host failure, the Windows Failover Cluster functionality kicks in at the application level and assumes ownership of serving the application out using the virtual machine that was not affected by a failed host. This means the application remains available to the users during the restart event of the virtual machine on the failed host. Once the virtual machine restarts on a healthy host, it will simply rejoin the nested Windows Failover Cluster configuration and will once again be able to provide high availability to the application.
For example, if an Exchange Database Availability Group (DAG) uses servers running on a Hyper-V cluster and one of the cluster nodes fails, the loss of the DAG member on the failed host would not bring the Exchange application down.
The Exchange application is protected by the guest clustering technology, which makes it resilient to the underlying failure. Virtual machine guest clustering extends the high availability provided by Windows Server Hyper-V Failover Clustering directly to the applications running inside the clustered virtual machines. This allows cluster-aware applications to be supported directly inside Hyper-V virtual machine workloads.
As we can see, virtual machine guest clustering can add a great deal of redundancy and resiliency to any application.
Components and Requirements of Hyper-V Guest Clustering
Setting up a Hyper-V guest cluster on top of a physical Hyper-V cluster involves a bit of complexity. The guest cluster is provisioned in much the same way as a physical Windows Failover Cluster, and the requirements are essentially the same. However, there are a few differences in how the cluster nodes are configured, because they are virtual machines rather than physical hosts. This alleviates a few of the hardware concerns that normally need to be validated for a cluster node.
However, you still have requirements regarding:
- Shared storage via Shared VHDX or VHDS (VHD Set)
- Network Requirements – Cluster network, etc
- Windows Server 2012 R2 and higher running in guest virtual machines
- Windows Failover Clustering feature will need to be installed on both guest virtual machines
- Anti-Affinity Rules to keep guest clustered VMs on separate physical Hyper-V hosts
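As a rough illustration of the Failover Clustering requirement, the following sketch installs the feature on two guests and then validates and creates the guest cluster. The node names, cluster name, and IP address are hypothetical placeholders, and the sketch assumes PowerShell remoting to both guests is available from a machine that has the Failover Clustering management tools installed.

```powershell
# Hypothetical guest VM names; substitute your own.
$guestNodes = 'GuestNode1', 'GuestNode2'

# Install the Failover Clustering feature on each guest (requires PowerShell remoting).
Invoke-Command -ComputerName $guestNodes -ScriptBlock {
    Install-WindowsFeature -Name Failover-Clustering -IncludeManagementTools
}

# Run cluster validation against the prospective nodes before creating the cluster.
Test-Cluster -Node $guestNodes

# Create the guest cluster; the name and static address are placeholders.
New-Cluster -Name 'GuestCluster1' -Node $guestNodes -StaticAddress '192.168.1.50'
```

Reviewing the validation report from Test-Cluster before running New-Cluster is generally worthwhile, since it surfaces storage and network problems early.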
Shared Storage – Shared VHDX and VHD Set
Perhaps one of the most important aspects of the virtual environment that stands as a requirement is shared storage.
In Windows Server 2012 R2, Microsoft introduced Shared VHDX files, which can be shared between virtual machines. This is a perfect fit for guest clustering, since one of the requirements for creating a Windows Failover Cluster, and by extension a guest cluster, is shared storage. Starting with Windows Server 2016, Microsoft introduced the VHD Set, which is a virtual disk created with a VHDS extension.
The VHD Set addresses some of the shortcomings of the Shared VHDX. Those shortcomings included not being able to resize the shared disk while it was in use or carry out host-level backups.
The VHD Set file (.vhds) contains metadata that coordinates access to the shared disk among the nodes in the guest cluster. In Windows Server 2016, the data itself is stored in an accompanying automatic VHDX (AVHDX) file, which can be either fixed or dynamically expanding.
With a VHD Set File, Microsoft has added the ability to make use of a storage snapshot and then update VM configurations in order to reference the correct configuration. The VHDS file is a pointer file that simply contains the checkpoint metadata concerning the Shared VHDX. This configuration file serves as an external configuration that is shared between the two VMs.
In Windows Server 2012 R2, the Shared VHDX had a limitation in this regard: each VM had its own configuration file for the Shared VHDX, and each copy was updated with metadata separately. This created a problem when it came to resynchronizing the data. The centralized nature of the VHD Set provides one place where updates can be made when the underlying storage changes. When running guest clustering as a production mechanism for high availability, you want the ability to properly back up the guest cluster nodes. The VHD Set file makes this possible.
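A minimal sketch of provisioning a VHD Set and attaching it to both guest cluster nodes might look like the following. The path, size, and VM names are hypothetical, and the sketch assumes it is run on a Windows Server 2016 or later Hyper-V host that currently hosts both VMs (otherwise the -ComputerName parameter would be needed).

```powershell
# Hypothetical path on a Cluster Shared Volume.
$sharedDisk = 'C:\ClusterStorage\Volume1\GuestCluster1\Data.vhds'

# The .vhds extension tells New-VHD to create a VHD Set rather than a plain VHDX.
New-VHD -Path $sharedDisk -SizeBytes 100GB -Dynamic

# Attach the same VHD Set to each guest cluster node as a shared disk.
# 'GuestNode1' and 'GuestNode2' are placeholder VM names.
foreach ($vm in 'GuestNode1', 'GuestNode2') {
    Add-VMHardDiskDrive -VMName $vm -Path $sharedDisk -SupportPersistentReservations
}
```

The -SupportPersistentReservations switch is what marks the disk as shared, so that more than one VM can attach it at the same time.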
Network Requirements
When thinking about the networking requirements for virtual machine guest clustering, there are a few things to be considered.
- As with a physical cluster, you want to separate network paths for internal cluster communication vs client network communications
- Each VM in a virtual machine guest cluster will need to have two virtual network adapters assigned with possibly more adapters depending on storage
- Additionally, if you are using Network Load Balancing along with a guest cluster, you have to enable MAC address spoofing which allows the virtual machines to change the source MAC address in outgoing packets to one that is not assigned to them
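The adapter configuration described above can be sketched roughly as follows, run on the Hyper-V host. The VM name and switch name are hypothetical, and 'Network Adapter' is assumed to be the (default) name of the VM's existing client-facing adapter.

```powershell
# Hypothetical VM name; substitute your own.
$vmName = 'GuestNode1'

# Add a second adapter dedicated to internal cluster traffic.
# 'ClusterSwitch' is a placeholder for an existing virtual switch.
Add-VMNetworkAdapter -VMName $vmName -SwitchName 'ClusterSwitch' -Name 'Cluster'

# Only needed when Network Load Balancing is used inside the guest:
# allow the client-facing adapter to send with a MAC address not assigned to it.
Set-VMNetworkAdapter -VMName $vmName -Name 'Network Adapter' -MacAddressSpoofing On
```

The same commands would be repeated for every VM in the guest cluster.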
Anti-Affinity Rules
With virtual machine guest clustering, you do not want the virtual machines to reside on the same host in the cluster, as this would defeat the purpose of the guest cluster. Using Anti-Affinity Groups, you can ensure that each virtual machine resides on a different host.
You can use PowerShell on a Hyper-V host to configure the Anti-Affinity Group name for each VM that is a member of the same Guest Cluster:
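# "GuestCluster1" is a placeholder class name; use the same name for every VM in the guest cluster
$AntiAffinityGroup = New-Object System.Collections.Specialized.StringCollection
$AntiAffinityGroup.Add("GuestCluster1") | Out-Null
(Get-ClusterGroup "HyperHost1").AntiAffinityClassNames = $AntiAffinityGroup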
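To apply the same anti-affinity class to every member of a guest cluster and confirm the result, a loop along these lines could be used on a node of the physical Hyper-V cluster. The VM role names and class name are hypothetical placeholders.

```powershell
# Hypothetical names: the clustered VM roles and the anti-affinity class they share.
$guestVMs  = 'GuestNode1', 'GuestNode2'
$className = 'GuestCluster1'

foreach ($vm in $guestVMs) {
    # AntiAffinityClassNames expects a StringCollection, so build one per VM.
    $classes = New-Object System.Collections.Specialized.StringCollection
    $classes.Add($className) | Out-Null
    (Get-ClusterGroup -Name $vm).AntiAffinityClassNames = $classes
}

# Confirm the assignment and see which host currently owns each VM.
Get-ClusterGroup -Name $guestVMs | Format-Table Name, OwnerNode, AntiAffinityClassNames
```

VMs that share a class name are kept on separate hosts where possible, so the OwnerNode column should normally show a different host for each guest cluster member.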
Why might organizations not utilize virtual machine guest clustering?
While Guest Clustering is a great way to provide the highest availability to applications, it does come with a few “costs”. These costs are in terms of both complexity and cost of licensing.
- Not every application is supported for clustering inside a virtualized environment. Supportability differs from one application to the next, so make sure to check with the application vendor for the specifics
Some of the examples of cluster-aware applications that are well-suited for virtual machine guest clustering are:
- Continuously available file shares
- DFS Servers, namespaces, etc
- SQL Server
- Network resources such as DHCP
- Virtual machine guest clustering adds quite a bit of complexity to the environment. There is generally much more configuration involved, along with other operational considerations that need to be taken into account
- Additionally, organizations must consider the additional licensing costs that come with virtual machine guest clustering. Clustering by definition requires multiple servers to run the application, and each installation of an application generally requires its own license, depending on the application vendor
With the above in mind, organizations may choose not to create a virtual machine guest cluster for every single application, and instead reserve guest clustering for the applications that are cornerstones of the business.
Hyper-V Guest Clustering provides a powerful way to deliver high availability for applications, as opposed to simply for virtual machines. Hyper-V clusters provide high availability to the guest virtual machines; however, the failover process involves the virtual machine being restarted on a healthy host. This means downtime at least long enough for the virtual machine to reboot. With guest clustering, the guest virtual machines are clustered with Windows Failover Clustering, which provides failover at the application layer and greatly minimizes the interruption, in some cases making a failover unnoticeable to end users and business stakeholders. This drastically increases the availability of business-critical applications.