Chapter 12. Clustering Technologies
Clusters provide fault tolerance to a group of systems so that the services they provide are always available, or are at least unavailable for the least possible amount of time. Clusters also provide a single public-facing presence for a set of systems, which means end users and others who take advantage of the resources cluster members provide aren't aware that the cluster comprises more than one machine. They see only a single, unified presence on the network. The dirty work of spreading the load among multiple machines is done behind the scenes by clustering software.
Microsoft provides two distinct types of clustering with Windows Server 2003:
Network load-balancing (NLB) clusters
These types of clusters allow for the high availability of services that rely on the TCP/IP protocol. You can have up to 32 machines running any edition of Windows Server 2003 and Windows 2000 Server (with one minor exception, covered later in this chapter) participating in an NLB cluster.
True server clusters
Server clusters are the "premium" variety of highly available machines and consist of servers that can share workloads and processes across all members of the cluster (with some exceptions, as you'll see later in this chapter). Failed members of the cluster are automatically detected and the work being performed on them is moved to other, functional members of the cluster. True server clusters are supported in only the Enterprise and Datacenter editions of Windows Server 2003.
Where might each type of cluster be useful? For one, NLB is a very inexpensive way to achieve TCP/IP high availability for servers that run web services or other intranet or Internet applications. In effect, NLB acts as a balancer, distributing the load equally among multiple machines running their own, independent, isolated copies of IIS. NLB protects only against a server going offline: if a machine running a copy of IIS fails, the load is redistributed among the other servers in the NLB cluster. Dynamic web pages that maintain sessions don't receive much benefit from this type of clustering because members of the cluster are running independent, unconnected copies of IIS and therefore cannot continue sessions created on other machines. However, much web content is static, and some implementations of dynamic web sites do not use sessions. Thus, chances are that NLB can improve the reliability of a site in production. Other services that can take advantage of NLB are IP-based applications such as FTP and VPN.
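To make the session limitation concrete, here is a minimal Python sketch of a toy load spreader. It is an illustration only, not the algorithm NLB actually uses: the class name ToyCluster, the host names, and the choice of hashing the client address are all assumptions made for the example. Each host keeps its own private session table, so when a host drops out, its clients land on another host and start over.

```python
import hashlib


class ToyCluster:
    """Toy model of spreading clients across hosts (not the real NLB algorithm)."""

    def __init__(self, hosts):
        self.hosts = list(hosts)
        # Each host has its own isolated session table, like independent copies of IIS.
        self.sessions = {host: {} for host in self.hosts}

    def pick_host(self, client_ip):
        # Hash the client address so the same client maps to the same host
        # for as long as the set of hosts does not change.
        digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % len(self.hosts)
        return self.hosts[index]

    def handle(self, client_ip):
        # Return the host that served the request and that host's
        # request count for this client (its "session" state).
        host = self.pick_host(client_ip)
        table = self.sessions[host]
        table[client_ip] = table.get(client_ip, 0) + 1
        return host, table[client_ip]

    def remove_host(self, host):
        # Simulate a server going offline; its session data goes with it.
        self.hosts.remove(host)
        del self.sessions[host]


cluster = ToyCluster(["web1", "web2", "web3"])
first_host, _ = cluster.handle("10.0.0.7")
cluster.handle("10.0.0.7")          # same host, session count is now 2
cluster.remove_host(first_host)     # that server goes offline
new_host, count = cluster.handle("10.0.0.7")
print(new_host != first_host, count)  # the client moved and its session restarted
```

The point of the sketch is the last three lines: the remaining hosts pick up the client, but nothing they know about carries over from the failed machine.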
If you have business-critical applications that must be available at all times, true server clustering is a better fit for that type of use. In true server clusters, all members of the cluster are aware of all the other members' shared resources. The members also maintain a "heartbeat" pulse to monitor the availability of services on their fellow members' machines. In the event of a resource or machine failure, the Windows Server 2003 clustering service can automatically hand off jobs, processes, and sessions begun on one machine to another machine. That isn't to say this swapping is completely transparent. When the application is moved or fails over to another member in the cluster, client sessions are actually broken and reestablished on the new owner of the resources. Although this happens relatively quickly, depending on the nature of your application it probably will not go unnoticed by your users. Often, your clients could be asked to reauthenticate to the new cluster owner. However, the cluster effectively acts as one unit and is completely fault-tolerant, and if you design the structure of your cluster correctly, you can avoid any single point of failure. This decreases the chance that a single failed hardware or software component will bring your entire business-critical application to its knees.
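The heartbeat-and-handoff idea can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not how the Windows Server 2003 clustering service is implemented: the class ToyServerCluster, the three-second timeout, the node names, and the rule of moving work to the first surviving node are all hypothetical choices made for illustration. Time is passed in explicitly so that the example is deterministic.

```python
class ToyServerCluster:
    """Toy heartbeat monitor with failover (illustrative only)."""

    def __init__(self, nodes, timeout=3.0):
        self.timeout = timeout                      # assumed value, in seconds
        self.last_seen = {node: 0.0 for node in nodes}
        self.owner = {}                             # resource name -> owning node

    def assign(self, resource, node):
        self.owner[resource] = node

    def heartbeat(self, node, now):
        # Record the time at which a node last reported in.
        self.last_seen[node] = now

    def alive(self, now):
        # A node counts as alive if its last heartbeat is recent enough.
        return [n for n, t in self.last_seen.items() if now - t <= self.timeout]

    def check(self, now):
        """Move resources off nodes whose heartbeat is overdue.

        Returns a list of (resource, old_node, new_node) tuples.
        """
        survivors = self.alive(now)
        moved = []
        for resource, node in list(self.owner.items()):
            if node not in survivors and survivors:
                new_node = survivors[0]
                self.owner[resource] = new_node
                moved.append((resource, node, new_node))
        return moved


cluster = ToyServerCluster(["NODE-A", "NODE-B"])
cluster.assign("database", "NODE-A")
cluster.heartbeat("NODE-A", now=0.0)
cluster.heartbeat("NODE-B", now=0.0)
cluster.heartbeat("NODE-B", now=2.0)   # NODE-A has gone quiet
print(cluster.check(now=4.0))          # NODE-A is overdue, so its resource moves
```

Notice that the sketch only changes which node owns the resource. Anything a client had in progress on the failed node still has to be reestablished on the new owner, which is why the handoff is not invisible to users.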
In this chapter, I'll deal with each type of clustering individually, introducing concepts and showing you how to accomplish the most common administrative tasks.