Yes, it’s 2012 and we’re still talking about whether or not organizations should consider running a Microsoft Windows Failover Cluster (sometimes referred to as MSCS clustering) in a vSphere environment. I know this topic has been written about before by others but I wanted to share some of my own thoughts and experiences around this topic. My focus these days is helping organizations virtualize their mission critical applications, and in that pursuit the topic of guest clustering comes up often.
What is supported?
To start with, one common misconception is that guest clustering is not supported at all in vSphere. If anyone out there still believes this (and I’ve spoken to many organizations over the years that do), I’d like to state definitively that this is not true. Guest clustering is absolutely supported by both VMware and Microsoft provided you follow the guidelines from both companies in order to maintain support.
One of the best KB articles VMware has released on this subject can be found here. It does a great job of summarizing the various supported configurations and goes into some application specific clustering types as well. I keep this KB article handy and use it frequently in discussions with customers. The following table lists the supported configurations:
I know many folks are very much against the idea of virtualizing a Microsoft cluster and will argue vehemently against its use. I can tell you I am most definitely not in that camp – I believe that the needs and requirements of the business should dictate whether or not clustering is used.
Why would you use guest clustering?
OK, so now you’ve seen what’s supported, and after looking at that table you’re probably wondering if it’s even worth it. Is virtualizing a cluster kind of a pain? Yep, you’ll get no argument from me that it’s more difficult. To name a few of the restrictions: you can’t use vMotion or DRS in Fully Automated mode, you’re forced to use RDMs, and only Fibre Channel storage is supported. The largest restrictions apply to “shared disk” clusters, meaning clusters that require dedicated storage that is shared amongst the cluster nodes.
With all of these restrictions, why would you use guest clustering in the first place?
Guest Clustering vs. vSphere HA
Let’s be clear about one thing: vSphere HA does not provide the same level of availability as guest clustering. vSphere HA is an awesome feature that can be used in combination with guest clustering, but HA is not application aware and can only protect against hardware and operating system failure.
Application-aware high availability
The biggest reason I see customers using guest clustering is to provide high availability at the application level beyond what native vSphere features can provide. SQL and older versions of Exchange are two commonly clustered guests. In particular I’ve worked with a lot of customers recently who have physical SQL clusters running SQL 2005 and 2008, and they are working towards bringing them into vSphere as part of larger SQL consolidation and SQL as a Service projects.
Guest OS patching
Many organizations still use clusters so that they can patch the underlying OS or application without causing an outage to end users. As covered in the next section, newer versions of Microsoft applications have functionality that can provide this benefit without clustering. I expect we’ll see clusters used just for this purpose decline as applications improve.
For many organizations, the availability that clustering provides is more important than the vSphere features they lose by implementing it. It all comes down to what is important to the business.
Alternatives to guest clustering
Of course there are alternatives to clustering that can provide the same or similar levels of availability without the restrictions. Here are some examples.
Newer versions of Exchange support technologies that do not require shared disks to provide high availability. Exchange 2010 in particular supports Database Availability Groups, which can provide HA down to the database level but do not require any shared disk clustering. That means Exchange 2010 can support vMotion, DRS, and all the other great vSphere features. You can read more about the Exchange 2010/vSphere goodness here.
SQL Database Mirroring/SQL 2012
SQL has had database mirroring for years, and mirroring can provide similar levels of availability to clusters without requiring shared storage. And with the release of SQL 2012 earlier this month, Microsoft has improved upon that concept with SQL “AlwaysOn” technology. AlwaysOn is similar to Database Availability Groups in Exchange 2010 in that it provides multiple database copies and recoverability at the database level.
With the popularity of storage arrays offering NAS capabilities, the use of file clusters seems to be declining. These arrays have multiple controllers and redundancy across the platform that can provide the same or better availability than a traditional file cluster.
Use In-Guest iSCSI
Do you want to combine the enterprise features of vSphere with the availability of guest clustering? One option to consider is using in-guest iSCSI to present storage to clustered virtual machines. This method gets around the VMware policy of only supporting Fibre Channel while still allowing features like vMotion to be used. It is by no means an easy solution: you may need to adjust cluster heartbeat timeouts to allow vMotion to operate successfully, networking at the vSphere level becomes more complicated, multipathing software is harder to use, vendor support may be harder to obtain, and so on. Again it all comes down to the requirements of the business.
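To make the heartbeat timeout point a little more concrete, here is a rough sketch of the arithmetic involved. It assumes the cluster declares a node down after it misses a set number of consecutive heartbeats, so the tolerance window is simply the heartbeat interval multiplied by that count. The 1000 ms interval and threshold of 5 are my assumption of typical Windows failover cluster defaults (the SameSubnetDelay and SameSubnetThreshold properties), and the 8 second pause is a made-up number for illustration, not a measured vMotion figure. Check your own cluster's values and measure your own environment before changing anything.

```python
# Rough sketch only: the function names and the sample numbers below are
# hypothetical and are not taken from VMware or Microsoft documentation.

def heartbeat_tolerance_ms(delay_ms: int, threshold: int) -> int:
    """How long a node can go silent before the cluster treats it as down."""
    return delay_ms * threshold


def min_threshold_for_pause(delay_ms: int, pause_ms: int, margin_ms: int = 0) -> int:
    """Smallest missed-heartbeat count whose window exceeds pause + margin."""
    needed_ms = pause_ms + margin_ms
    # Integer division plus one guarantees delay_ms * threshold > needed_ms.
    return needed_ms // delay_ms + 1


if __name__ == "__main__":
    delay_ms = 1000   # assumed heartbeat interval (SameSubnetDelay)
    threshold = 5     # assumed missed-heartbeat limit (SameSubnetThreshold)
    pause_ms = 8000   # hypothetical worst-case pause during a vMotion
    margin_ms = 2000  # hypothetical safety margin

    print("current tolerance (ms):", heartbeat_tolerance_ms(delay_ms, threshold))
    print("threshold needed:", min_threshold_for_pause(delay_ms, pause_ms, margin_ms))
```

The trade-off is that a longer tolerance window also means the cluster takes longer to notice a node that really has failed, so any increase should be weighed against the failover time the business expects.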
Have I convinced any of you that guest clustering is not the worst thing in the world? I really hope so, because I don’t think we should ever be so closed-minded that we immediately dismiss something because it is difficult to implement. I am definitely in the camp of folks who prefer to avoid using guest clustering so that I can take advantage of all the great features in vSphere. But I am most definitely not in the camp of people who dismiss it entirely.
For me, it comes down to what the needs of the business are and how those needs can be met. There are situations where guest clustering is required, and I don’t think we should be telling organizations to keep those servers physical. If you want to virtualize mission critical applications then you should be prepared to consider all possible configurations to meet the needs of the business.