SQL Failover Clusters and You

SQ failover cluster

Failing over nodes with a cluster is a common configuration for high priority application servers. In general, failover clusters group interconnected nodes with centralized software management. The software directs redundant services, and in the event of a failed node automatically fails over application processing to a working node within the cluster.

The ideal is to achieve minimal data loss with fast failover, although in practice busy applications might lose some transactions during failover. This configuration is also known as a high-availability cluster, and replaces single points of failure like single application servers.

SQL Server Cluster Failover

A SQL Server failover cluster provides high availability in SQL Server environments. Failover cluster services create virtual SQL server instances across the cluster. In the event of a downed node, or in scheduled response to threshold events such as slow-downs or spiking IOPs, services will alert administrators and automatically failover the server instances to redundant nodes. Ideally there will be no data loss during the failover, but this largely depends on how many transactions your SQL Server is processing. The goal is at least minimal data loss.

SQL Server failover also requires shared storage. This is an issue in the cloud but there are additional services available that support cloud-based replication and cluster integration. In any event, stored data must be exposed to the cluster. This is assuming that your shared storage is available and redundant as well; build in availability to your storage as well as the SQL Server nodes.

For example, Microsoft uses failover cluster instances (FCI) as part of its SQL Server Always On offering. MS Windows Server Failover Clustering (WSFC) creates server-level FCIs across participating nodes. The FCI looks to the network like a single server host. But upon node failure, the WSFC coordinates the failover response and restarts the instance on a working node. This fast operation minimizes data loss in heavily transactional environments.

SQL failover clusters offer four node configurations: active/active, active/passive, N+M, and N+1. Active/active configures two active nodes, each hosting multiple VMs. They share resources between them and each node can failover to the other. Active/passive is a primary/secondary node configuration where one node is the primary host for the VMs, and the secondary node serves as a standby server for a failed primary node.

N+M high availability failover clusters have at least two active nodes for primary processing, and at least two secondary nodes for failover standby nodes. N+1 has at least two active nodes that share the same standby node.

Note that none of these configurations are limited to a single local cluster. You can configure geographically separate SQL Server clusters within the cloud, within your private cloud, or between your private cloud to a cloud provider.

Creating SQL Server failover clusters is not a walk in the park, but it is a best practice for protecting high-priority SQL Server transactions. Consider making the job easier by contracting with cloud providers who are expert at SQL Server cluster failover. The cloud has distinct advantages such as letting you install failover services for distributed databases.