Monthly Archives: November 2013

The need for Change Block Tracking to perform Differential backups of Hyper-V VMs

Hyper-V has been gaining momentum and with Hyper-V 2012 supporting SMB 3.0 based NAS storage, this momentum is likely to accelerate. And of course, any commercial deployment needs to have a proper backup policy. This blog examines some simple, but often overlooked problems in Hyper-V backup, which of course, all backup vendors have tackled in some way or the other.

In general, there are two ways to do a backup with VMs

  1. Backup from within a VM
  2. Backup from the hypervisor aka Hyper-V parent partition

In some particular cases, only of these choices is feasible. Figure 1 shows a VM that uses an iSCSI LUN that is passed through directly to the VM.


After the advent of the VHDX file format and the associated performance and robustness improvements, there are even less reasons to use the configuration depicted in Figure 1. However, in this configuration, the only way to backup is by running a backup application within the VM. The speed boost you get by eliminating the NTFS stack within the parent partition is marginal, and live migration of a VM with this configuration involves a LUN transfer when the iSCSI volume needs to be moved to a different Hyper-V host.


Figure 2 shows a more typical configuration with a VM using a VHD(x) file (VHD or VHDX). Figure 2 shows this VM being backed up from within the VM, even though other choices exist. The main drawback here is that if the Hyper-V host were running 20 such VMs, one would have to pay the cost of 20 Backup App licenses.


Figure 3 shows a VM again using a VHD(x) file, but backup being performed from the Hyper-V parent partition. This is a popular configuration since the cost of the Backup App can be amortized over all the VMs being hosted. The BackUp App depends upon the VSS infrastructure Microsoft has created that runs in both the Hyper-V parent partition and inside the VM. Of course, if the VM is running an OS where no VSS IC requestor exists, this configuration is not feasible.

Once a snapshot is created and a fullback is done, the full backup will include a complete copy of the VHD(x) file. Given that the VHD(x) file will be at least 10s of GBs as in ranging anywhere from 20GB to 100GB or more, it is highly desirable that the subsequent backups be differential backups which only backup changed data within the VHD(x) file.


And that is where the problem lies. None of Windows 2012, Windows 2012 R2, or Hyper-V 2012 provide a facility to determine the changed blocks within the VHD(x) file.

An ideal solution would install in the Hyper-V parent partition and would install, uninstall, load, unload without requiring a reboot of the Hyper-V parent partition. The VMs would have to be restarted for the change tracking to work. This is shown in Figure 5.


Any ISV or OEM looking for such a generic solution is encourage to contact me via LinkedIn.

How SMB 3 Witness Protocol detects failure without any timeouts

The SMB 3 protocol that first shipped with Windows Server 2012 (and Windows 8) is remarkable for making Network Attached Storage (NAS) comparable, and in some senses, even superior to Direct Attached Storage (DAS). NAS is now almost as fast as DAS when used without hardware acceleration. When used with hardware acceleration using the sister protocol SMB Direct also referred to as RDMA, the speed can be even higher! Further, SMB 3.0 based NAS is as reliable since it provides detection of node failures and failover of open file handles (without invalidating the handle), all within a matter of 5 seconds or less. See Jose Barreto blogs for descriptions of SMB Direct and SMB Multi Channel that emphasize the speed aspects of SMB 3.0.

Given that SMB timeouts are of the order of 40 seconds, and TCP timeouts are also of a similar order of time, SMB 3.0 cannot reply upon timeouts to detect failures. This blog explains the basics of how the Witness Protocol works in conjunction with SMB 3.0 to achieve the required failure detection and failover.

This blog provides an overview and is NOT aimed at a developer audience since some technical details are skipped.

It all starts with an SMB 3 client connecting to an SMB 3 clustered file server as shown in Diagram 1


The client notices the highly available share and using the Witness Protocol (which is RPC based), requests the node to which it connected for data path access to return a list of IP addresses for each cluster node running the Witness Protocol Service. This is shown in Diagram 2.


As shown in diagram 3, the server responds with a list of IP addresses for all cluster nodes running the Witness Service Protocol. The protocol allows for returning both IPv4 and IPv6 addresses.


The client receives this information and registers a notification with one of the cluster nodes other than Node A, with which it is already connected to consume data via SMB 3.0. The idea is that the cluster nodes will be running a cluster quorum protocol, whatever it is, and hence the cluster nodes B, C, D (in this example) will notice if and when node A becomes unavailable. This is shown in Diagram 4.


Now imagine that node A becomes unavailable for some reason as shown in Diagram 5. The exact reason is immaterial. It could be a power failure or network failure or a system crash or some other reason.


Node B (and also C and D) notice that node A is unavailable via the cluster quorum protocol running within the cluster. Node B (in this example) issues an RPC callback to the client notifying it that Node A is unavailable.


The client then performs an SMB Session Setup, Tree Connect etc to any one of the other remaining nodes. In Diagram 7 in this example, the client connected to Node C.


Note that the “client” can itself be another server e.g. the client could be a SQL server or an IIS server.