I’m taking the vSphere Design Workshop class this week, and during class one of my classmates mentioned a scenario he encountered that I hadn’t heard of. The situation involves using VM Monitoring as part of VMware HA and performing guest maintenance.
VM Monitoring is a feature of HA that will automatically restart virtual machines when the VMware Tools heartbeat is lost for a certain period of time. Normally the heartbeat isn’t lost unless something bad has happened to the guest, like a blue screen or other crash. In that case HA will restart the VM in an attempt to get it back online.
The scenario that was brought up in class involved this feature combined with guest maintenance. Suppose you need to reboot your VM and keep it running but not booted into the OS for an extended period of time. Things like booting from an ISO to use imaging software, booting from a GParted ISO to resize a partition, or even going into a VM’s BIOS are valid examples of when you might do this.
As you might expect, when the VM is booted from the ISO there are no VMware Tools heartbeats, so HA treats this as a failure and restarts the VM. I confirmed this behavior in my lab with a test VM by booting from a Windows 2008 R2 ISO and letting it sit at the install screen. Sure enough, after 30 seconds or so the VM rebooted and I saw the following event for this VM (the screenshot is a nice touch):
This virtual machine reset by HA. Reason: VMware Tools heartbeat failure. A screenshot is saved at /vmfs/volumes/493973c5-a2392745-74e6-001d0-97282e4/TestVM/TestVM-screenshot-.png
Seems like an obvious thing, but it might take some people by surprise. At best it’s just an annoying distraction; at worst the reboot interrupts some kind of system repair partway through and the interruption damages the guest. There are configurable values that determine how long HA will allow a VM to stay up without receiving heartbeats, as well as how many times HA will restart the VM when it has no heartbeats. Depending on those settings this won’t affect everyone, but it could still be an issue if you’re not careful.
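To get a feel for how those two settings interact, here is a rough toy model in Python. It is only a sketch, not how HA actually works internally: the names MonitoringPolicy, failure_interval_s, max_resets and simulate_resets are made up for illustration, the numbers are example values, and the model ignores any other settings HA takes into account. It assumes the guest sends no heartbeats at all and that the timer starts over after each reset.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MonitoringPolicy:
    """Hypothetical stand-in for the two settings described above."""
    failure_interval_s: int  # seconds without a heartbeat before a reset
    max_resets: int          # resets allowed before the monitor gives up


def simulate_resets(policy: MonitoringPolicy, seconds_without_tools: int) -> List[int]:
    """Return the times (in seconds) at which a reset would fire.

    Assumes the guest never sends a heartbeat during the whole period and
    that the silence timer restarts from zero after every reset.
    """
    reset_times: List[int] = []
    next_reset = policy.failure_interval_s
    while next_reset <= seconds_without_tools and len(reset_times) < policy.max_resets:
        reset_times.append(next_reset)
        next_reset += policy.failure_interval_s
    return reset_times


# Example values only: a 30-second interval, at most 3 resets,
# and a VM left sitting at an ISO boot screen for 10 minutes.
policy = MonitoringPolicy(failure_interval_s=30, max_resets=3)
print(simulate_resets(policy, seconds_without_tools=600))  # prints [30, 60, 90]
```

With these example values, a ten-minute partition resize would get interrupted three times in the first minute and a half, which is why a longer interval alone may not be enough protection for long maintenance jobs.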
The moral of the story: if you’re using VM Monitoring with HA, remember to temporarily disable it before doing guest maintenance that keeps the VM running without VMware Tools heartbeats for an extended period of time.
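If you script your maintenance, the "temporarily" part is the easy one to forget. One way to make sure monitoring gets switched back on is to wrap the work in a Python context manager. This is just a sketch of the pattern: monitoring_paused is a made-up name, and the disable and enable callables are placeholders for however you actually change the setting in your environment. Nothing here talks to vCenter.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def monitoring_paused(disable: Callable[[], None],
                      enable: Callable[[], None]) -> Iterator[None]:
    """Run a maintenance block with VM Monitoring off, then turn it back on.

    `disable` and `enable` are hypothetical callables supplied by the caller;
    they stand in for whatever really changes the setting.
    """
    disable()
    try:
        yield
    finally:
        enable()  # runs even if the maintenance step raises an exception


# Usage with stand-in callables that only record what happened.
log = []
with monitoring_paused(lambda: log.append("monitoring off"),
                       lambda: log.append("monitoring on")):
    log.append("boot from ISO, resize partition")
print(log)
```

The try/finally is the point of the exercise: even if the maintenance step blows up, the enable call still runs, so the VM does not quietly stay unprotected.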