Storage is one of the areas where Windows Server in general, and Windows Server Hyper-V in particular, has evolved the most.
New storage technologies give organizations capabilities for solving complex and demanding business challenges. Windows Server 2019 and Windows Server 2019 Hyper-V include some impressive new storage abilities that administrators will want to note and take advantage of.
One of the exciting areas of improvement in Windows Server 2019 is data deduplication.
In this post, we will take a look at Windows Server 2019 Hyper-V ReFS Deduplication to see how new features with Microsoft’s latest file system ReFS have created great new capabilities that can be utilized in the realm of Hyper-V virtualization.
Let’s begin by taking a step back and looking at data deduplication and ReFS in general.
What is Data Deduplication?
If you have been around storage or virtualization technologies for any length of time, you have no doubt heard about data deduplication, or deduplication for short. Storage vendors certainly use it as a buzzword. However, it is an important feature of today’s modern storage solutions.
What exactly is deduplication?
Most of today’s storage systems store data in chunks. When a volume holds many similar servers, files, and other items, as is the case with storage backing a Hyper-V virtualized environment, it is very likely that many of those chunks are identical across a number of servers.
“Deduplication is the ability to recognize and find these identical pieces of information that exist between the bits on disk and get rid of the extra copies of them.”
Example: If the same set of files were found on 20 different servers, you could effectively eliminate 19 copies and keep only 1 copy that is used for all 20.
This makes the storage system much more space efficient and avoids keeping unnecessary extra copies of identical data on expensive storage subsystems. Virtualized systems tend to have a tremendous amount of duplication between the server operating systems they contain, so deduplication can potentially yield HUGE space savings there.
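To make the idea concrete, here is a minimal sketch in Python of chunk-level deduplication. It is an illustration only: the fixed 4 KiB chunk size, the SHA-256 hashing, the function names `deduplicate` and `restore`, and the `server00` to `server19` sample data are all assumptions made for this example, not a description of how Windows Server implements deduplication internally.

```python
import hashlib


def deduplicate(files, chunk_size=4096):
    """Store each unique chunk once; each file becomes a list of chunk hashes."""
    store = {}    # chunk hash -> chunk bytes (the single physical copy)
    recipes = {}  # file name -> ordered list of chunk hashes
    for name, data in files.items():
        hashes = []
        for offset in range(0, len(data), chunk_size):
            chunk = data[offset:offset + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)  # keep only the first copy seen
            hashes.append(digest)
        recipes[name] = hashes
    return store, recipes


def restore(name, store, recipes):
    """Rebuild a file from its recipe of chunk hashes."""
    return b"".join(store[digest] for digest in recipes[name])


# Hypothetical data: 20 servers share two identical "operating system" chunks
# and each has one chunk that is unique to it.
os_image = b"A" * 4096 + b"B" * 4096
files = {f"server{i:02d}": os_image + bytes([i]) * 4096 for i in range(20)}

store, recipes = deduplicate(files)
logical = sum(len(data) for data in files.values())
physical = sum(len(chunk) for chunk in store.values())
print(f"logical bytes: {logical}, stored bytes: {physical}")
```

In this toy data set, 60 logical chunks collapse to 22 stored chunks, because the two shared operating system chunks are kept once instead of 20 times.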
Microsoft’s ReFS File System – No Longer Handicapped
Microsoft first shipped ReFS, the Resilient File System, with Windows Server 2012 and substantially reworked it in Windows Server 2016, adding the features that matter most for Hyper-V.
A new file system came as a surprise, since NTFS has been the default file system on Windows Server operating systems since the very beginning. However, ReFS touted some impressive resiliency and performance advantages over NTFS.
Is ReFS better than NTFS?
ReFS is not meant to be a replacement for NTFS, however. Why not? At least for now, ReFS is not bootable. This is a huge showstopper for ReFS totally replacing NTFS and other file systems. It is designed not so much for running the operating system itself as for the secondary utility storage that is often presented to workloads.
Learn more in detail here: Windows Server 2016 Hyper-V ReFS vs NTFS
One of the primary purposes of ReFS is to be used with Storage Spaces Direct (S2D) as the file system for formatting the S2D volumes to be used for Hyper-V virtual machine storage.
There are several advantages to ReFS as introduced with Windows Server 2016. These include:
- Has a theoretical maximum volume size of 1 yottabyte, or 1 trillion terabytes
- Uses file metadata to protect file system health and to speed up common operations:
  - Extending a disk or zeroing out new blocks is handled in metadata rather than by writing to the data blocks themselves
  - NTFS can take minutes to do the same operation because it actually zeroes out the blocks on disk
  - This is referred to as Sparse VDL (valid data length) technology, and it is the source of some of the largest performance gains in ReFS
- Includes block cloning, new in Windows Server 2016, which lets operations such as merging checkpoints work on metadata rather than on the actual data on disk
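The difference between zeroing blocks and zeroing through metadata can be sketched with a toy model in Python. This is only a conceptual illustration under simplifying assumptions: the class names `ZeroFilledDisk` and `MetadataZeroedDisk`, the 4 KiB block size, and the dictionary-based bookkeeping are invented for the example and do not reflect the actual on-disk structures of NTFS or ReFS.

```python
BLOCK_SIZE = 4096


class ZeroFilledDisk:
    """Provisioning writes a zero block for every block on the disk."""

    def __init__(self, size_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(size_blocks)]
        self.blocks_written = size_blocks  # work grows with disk size

    def read(self, index):
        return self.blocks[index]


class MetadataZeroedDisk:
    """Provisioning records only the size; unwritten blocks read as zeros."""

    def __init__(self, size_blocks):
        self.size_blocks = size_blocks
        self.written = {}        # block index -> data, only for real writes
        self.blocks_written = 0  # no data blocks touched at creation

    def _check(self, index):
        if not 0 <= index < self.size_blocks:
            raise IndexError("block outside the disk")

    def write(self, index, data):
        self._check(index)
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        self.written[index] = data
        self.blocks_written += 1

    def read(self, index):
        self._check(index)
        # Anything never written is reported as zeros from metadata alone.
        return self.written.get(index, bytes(BLOCK_SIZE))


eager = ZeroFilledDisk(1_000)              # small, because every block is written
lazy = MetadataZeroedDisk(1_000_000_000)   # huge, yet creation is instant
lazy.write(7, b"x" * BLOCK_SIZE)
print(eager.blocks_written, lazy.blocks_written)
```

The eager model does work proportional to the size of the disk at creation time, while the metadata model does none until data is actually written, which is the intuition behind provisioning fixed disks in seconds.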
One of the huge drawbacks of ReFS in Windows Server 2016 was that you couldn’t use deduplication with it. With Windows Server 2019, however, deduplication is now a fully supported feature of the ReFS file system! This offers tremendous space savings for Hyper-V storage.
Hyper-V environments arguably have some of the most “duplicated” files and resources of any type of environment. According to Microsoft, customers can see space savings of as much as 10 times in Hyper-V environments using ReFS deduplication.
How to Enable ReFS in Windows Server 2019
Since ReFS is a file system format for a volume, you can use standard tools such as the Disk Management utility or the diskpart command line utility to enable it.
In Disk Management, the New Simple Volume Wizard appears when you create a new volume.
On its Format Partition screen, you can select ReFS as the file system under “Format this volume with the following settings”.
Additionally, as mentioned, you can use the diskpart command line utility to format a new volume with ReFS as well.
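As a rough sketch of the diskpart route, the Python helper below only builds the text of a diskpart script; it does not run anything. The function name `build_refs_diskpart_script`, the disk number 2, the drive letter R, and the label `VMStore` are hypothetical values chosen for illustration, so substitute the values for your own disk.

```python
import string


def build_refs_diskpart_script(disk_number, drive_letter, label="HyperV"):
    """Return diskpart script text that formats a new partition with ReFS."""
    if not isinstance(disk_number, int) or disk_number < 0:
        raise ValueError("disk_number must be a non-negative integer")
    if len(drive_letter) != 1 or drive_letter.upper() not in string.ascii_uppercase:
        raise ValueError("drive_letter must be a single letter A-Z")
    if '"' in label or "\n" in label:
        raise ValueError("label must not contain quotes or newlines")
    commands = [
        f"select disk {disk_number}",             # target disk (assumed empty)
        "create partition primary",               # one partition on the disk
        f'format fs=refs label="{label}" quick',  # quick format as ReFS
        f"assign letter={drive_letter.upper()}",  # mount it under a drive letter
    ]
    return "\n".join(commands) + "\n"


# Hypothetical example values; confirm the real disk number first.
script = build_refs_diskpart_script(disk_number=2, drive_letter="R", label="VMStore")
print(script)
```

The resulting text could be saved to a file and run from an elevated prompt with `diskpart /s <file>`. Formatting destroys any data on the selected disk, so it is worth confirming the disk number with `list disk` beforehand.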
Windows Server 2019 Hyper-V ReFS Benefits and Considerations
We have covered many of the benefits in the earlier sections. For Hyper-V specifically, ReFS offers tremendous advantages in terms of performance, reliability, and capabilities, and they make it well worth consideration for virtualization admins.
- The Sparse VDL technology that allows virtual disks to be provisioned quickly is exciting. It means you can provision fixed (“thick provisioned”) disks in seconds rather than minutes
- Block cloning technology provides tremendous benefits for checkpoint merge operations, which with ReFS work only with the metadata on disk instead of the actual blocks of data. This allows for “lightning fast” checkpoint merges as well as much more efficient use of disk I/O. Other file operations are more efficient with ReFS as well
- With Windows Server 2019 Hyper-V ReFS volumes, deduplication can now be enabled, resulting in the tremendous space savings already mentioned
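The reason a metadata-only operation is so much cheaper than moving data can be illustrated with a small reference-counting model in Python. This is a conceptual sketch only: `ToyVolume`, its methods, and the file names are invented for the example, and it leaves out what happens when a shared block is later modified. It is not how ReFS is implemented.

```python
class ToyVolume:
    """Toy volume where files are lists of references to stored blocks."""

    def __init__(self):
        self.blocks = {}    # block id -> data
        self.refcount = {}  # block id -> number of file references
        self.files = {}     # file name -> ordered list of block ids
        self.data_blocks_written = 0
        self._next_id = 0

    def write_file(self, name, chunks):
        ids = []
        for chunk in chunks:
            block_id = self._next_id
            self._next_id += 1
            self.blocks[block_id] = chunk
            self.refcount[block_id] = 1
            self.data_blocks_written += 1
            ids.append(block_id)
        self.files[name] = ids

    def copy_file(self, src, dst):
        # Traditional copy: every block of data is read and written again.
        self.write_file(dst, [self.blocks[b] for b in self.files[src]])

    def clone_file(self, src, dst):
        # Clone: only metadata changes; both files point at the same blocks.
        ids = list(self.files[src])
        for block_id in ids:
            self.refcount[block_id] += 1
        self.files[dst] = ids

    def read_file(self, name):
        return b"".join(self.blocks[b] for b in self.files[name])


volume = ToyVolume()
volume.write_file("disk.vhdx", [b"boot", b"data", b"logs"])
volume.clone_file("disk.vhdx", "clone.vhdx")
written_after_clone = volume.data_blocks_written
volume.copy_file("disk.vhdx", "copy.vhdx")
written_after_copy = volume.data_blocks_written
print(written_after_clone, written_after_copy)
```

Cloning leaves the count of written data blocks unchanged, while the traditional copy doubles it. The same intuition explains why a checkpoint merge that only remaps block references finishes far faster than one that copies the data.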
Microsoft is certainly removing the roadblocks to using ReFS, especially with Hyper-V, now that it can be configured with deduplication. Additionally, few organizations want to be “early adopters” of brand new technologies, and the ReFS feature set that shipped with Windows Server 2016 was very much new at the time. Now, with Windows Server 2019, it has a few miles of road behind it and has developed a bit of maturity.
There are still a few points to keep in mind with ReFS when thinking about best practices and performance.
One might assume that ReFS is the advisable choice for Windows Server 2016 Hyper-V Cluster Shared Volumes (CSVs). However, this is not the case. ReFS CSVs always run in file system redirection mode, which sends all I/O over the cluster network to the coordinator node for the volume.
In deployments that use NAS or SAN storage, this can dramatically reduce CSV performance, so NTFS remains the preferred file system for CSVs in production in that configuration. Microsoft has not clarified or changed this direction at the time of this writing, so for now it appears to remain the guidance with Windows Server 2019.
ReFS is definitely the recommended file system for Windows Server 2016 or 2019 Hyper-V running Storage Spaces Direct (S2D), since S2D deployments built on RDMA network cards are not subject to the same performance penalty from the file redirection mode found in ReFS CSVs.
ReFS has come a long way since its early days on Windows Server 2016. It has matured as a technology, and Microsoft has removed major roadblocks to using it, most notably the lack of deduplication. With deduplication now supported in Windows Server 2019, ReFS is an extremely effective fit for Hyper-V environments running VDI or other highly duplicated virtual workloads.
There is no question that Windows Server 2019 Hyper-V ReFS deduplication will pave the way for a larger number of adopters running ReFS technology backing business-critical production workloads.