
InfiniBand

At one time, engineers (who are as unlikely to gamble as the parish priest) put their money on InfiniBand Architecture as the likely successor to PCI and PCI-X, the next interconnection standard for servers and, eventually, individual computers. The standard had wide industry backing and proven advantages in moving data around inside computers. Its success seemed all but assured by its mixed parentage, which pulled together factions once as friendly as the Montagues and Capulets. But as this is written, some industry insiders believe that the introduction of a new PCI variation, PCI Express, may usurp the role originally reserved for InfiniBand.

InfiniBand Architecture (also known as IBA) marks a major design change. It hardly resembles what's come to be known as an expansion bus. Instead of being loaded with clocks and control signals as well as data links, InfiniBand more closely resembles a network wire. It's stripped down to nothing but the connections carrying the data signals and, if necessary, some power lines to run peripherals. As with a network connection, InfiniBand packages data into packets. The control signals usually carried over that forest of extra bus connections are put into packets, too. Moreover, InfiniBand has no shared bus but rather operates as a switched fabric. In fact, InfiniBand sounds more like a network than an expansion system. In truth, it's both. It also operates as an interconnection system that links together motherboard components.

Designed to overcome many of the inherent limitations of PCI and all other bus designs, IBA uses data-switching technology instead of a shared bus. Not only does this design increase the potential speed of each connection because the bandwidth is never shared, it also allows more effective management of individual connected devices and almost unlimited scalability. That is, unlike PCI, it has no device limit but allows you to link together as many peripherals as you want, both as expansion boards inside a computer and as external devices.

As with PCI, InfiniBand starts as an interconnection system to link components together on motherboards. But the specification doesn't stop there. It defines a complete system that includes expansion boards and an entire network.

The initial implementation of InfiniBand operates with a signaling rate of 2.5GHz. Because it is a packetized communication system that also encodes the data it sends, it suffers from some signaling overhead. As a result, the maximum throughput of an InfiniBand circuit is 250MBps, or 2.0Gbps, in each direction. The speed is limited by necessity: affordable wiring systems simply cannot handle higher rates.

To achieve better performance, InfiniBand defines a multicircuit communication system. In effect, it allows a sort of parallel channel capable of boosting performance by a factor of up to 12.

InfiniBand is a radical change for system expansion, as different from PCI as PCI was from ISA. Good as it is, however, don't expect to find an InfiniBand expansion bus in new personal computers. As an expansion bus, it is expensive, designed foremost for servers. You'll find the physical dimensions for InfiniBand expansion in Chapter 30, "Expansion Boards."

As an interconnection system for motherboard circuits, however, InfiniBand will find its way into personal computers, along with the next generation of Intel microprocessors, those that use 64-bit architecture.

History

The shortcomings of PCI architecture were apparent to engineers almost by the time the computer industry adopted the standard. It was quicker than ISA but not all that swift. Within a few years, high-bandwidth applications outgrew the PCI design, resulting in the AGP design for video subsystems (and eventually 2x, 4x, and 8x versions of AGP).

To the engineers working on InfiniBand, PCI-X was only a Band-Aid for the bandwidth problems suffered by high-powered systems. Most major manufacturers were already aware that PCI-X shared the same shortcomings as PCI, burdened with its heritage of interrupts and hardware-mediated flow control. Increasing the PCI speed exacerbated the problems rather than curing them.

Two separate groups sought to create a radically different expansion system to supersede PCI. One group sought a standard that would be able to link all the devices in a company without limit and without regard to cost. The group working on this initiative called it Future I/O. Another group sought a lower-cost alternative that would break through the barriers of the PCI design inside individual computers. This group called its initiative Next Generation I/O.

On August 31, 1999, the two groups announced they had joined together to work on a single new standard, which became InfiniBand. To create and support the new standard, the two groups formed the InfiniBand Trade Association (ITA) with seven founding members serving as the organization's steering committee: Compaq, Dell, Hewlett-Packard, IBM, Intel, Microsoft, and Sun Microsystems. Other companies from the Future I/O and Next Generation I/O initiatives joined as sponsoring members, including 3Com, Adaptec, Cisco, Fujitsu-Siemens, Hitachi, Lucent, NEC, and Nortel Networks. On November 30, 2000, the ITA announced that three new members had joined: Agilent Technologies Inc., Brocade Communications Systems Inc., and EMC Corporation.

The group released the initial version of the InfiniBand Architecture specification (version 1.0) on October 23, 2000. The current version, 1.0.a, was released on June 19, 2001. The two-volume specification is distributed in electronic form without cost from the InfiniBand Trade Association Web site, www.infinibandta.org.

Communications

In traditional terms, InfiniBand is a serial communication system. Unlike other expansion designs, it is not a bus but rather a switched fabric. That means it weaves together the devices that must communicate, providing each one with a full-bandwidth channel. InfiniBand works more like the telephone system than a traditional bus: a switch routes the high-speed InfiniBand signals to the appropriate device. In technical terms, it is a point-to-point interconnection system.

InfiniBand moves information as packets across its channel. As with other packet-based designs, each block of data contains addressing, control signals, and information. InfiniBand uses a packet design based on the Internet Protocol so that engineers will have an easier time designing links between InfiniBand and external networks (including, of course, the Internet itself).

All InfiniBand connections use the same signaling rate, 2.5GHz. Because of the packet structure and data coding used by the system, this signaling rate amounts to an actual throughput of about 250MBps in each direction, or 500MBps counting both directions of a link at once. The InfiniBand system achieves even higher throughputs by moving to a form of parallel technology, sending its signals through 4 or 12 separate connections simultaneously. Table 9.4 summarizes the three current transfer speeds of the InfiniBand Architecture.

Table 9.4. InfiniBand Peak Throughput Versus Bus Width

Peak Throughput (Both Directions)   Signaling Rate   Wire Pairs (Each Direction)
500MBps                             2.5GHz           1
2GBps                               2.5GHz           4
6GBps                               2.5GHz           12

These connections are full-duplex, so devices can transfer information at the same data rate in either direction. The signals are differential (the system uses two wires for each of its signals) to help minimize noise and interference at the high frequencies it uses.
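
The figures in Table 9.4 follow from simple arithmetic. The sketch below assumes the 8b/10b line code used by first-generation InfiniBand links (10 bits on the wire for every 8 bits of data) as the source of the coding overhead. It ignores packet headers, and the function and constant names are illustrative only.

```python
# Sketch: reproduce the Table 9.4 figures from the signaling rate.
# Assumption: 8b/10b line coding (8 data bits for every 10 bits sent).

SIGNALING_RATE_GBPS = 2.5      # raw bit rate of one wire pair
CODING_EFFICIENCY = 8 / 10     # 8b/10b: 20 percent coding overhead


def throughput_mbytes_per_s(lanes: int, both_directions: bool = True) -> float:
    """Return usable throughput in megabytes per second for a link."""
    data_gbps = SIGNALING_RATE_GBPS * CODING_EFFICIENCY * lanes
    mbytes = data_gbps * 1000 / 8          # gigabits/s -> megabytes/s
    return mbytes * 2 if both_directions else mbytes


for lanes in (1, 4, 12):
    one_way = throughput_mbytes_per_s(lanes, both_directions=False)
    total = throughput_mbytes_per_s(lanes)
    print(f"{lanes:2d} pair(s): {one_way:6.0f} MBps each way, {total:6.0f} MBps total")
```

The one-way figure for a single pair is 250MBps; doubling it for the full-duplex link gives the 500MBps quoted in the table.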

The InfiniBand design allows the system to use copper traces on a printed circuit board like a conventional expansion bus. In addition, the same signaling scheme works on copper wires like a conventional networking system or through optical fiber.

As with modern interconnection designs, InfiniBand uses intelligent controllers that handle most of the work of passing information around so that it requires a minimum of intervention from the host computer and its operating system. The controller packages the data into packets, adds all the necessary routing and control information to each one, and sends them on their way. The host computer or other device need only send raw data to the controller, so it loses a minimal share of its processing power in data transfers.
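
The division of labor described above can be sketched in a few lines. This is a toy model, not the InfiniBand packet format: the header fields, the 256-byte payload size, and the function names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List

MAX_PAYLOAD = 256  # bytes per packet; illustrative value, not from the text


@dataclass
class Packet:
    source: int        # address of the sending node (hypothetical field)
    destination: int   # address of the receiving node (hypothetical field)
    sequence: int      # position of this packet within the message
    payload: bytes     # one slice of the raw data


def packetize(data: bytes, source: int, destination: int) -> List[Packet]:
    """Split raw data into packets, each carrying its own routing header."""
    return [
        Packet(source, destination, seq, data[offset:offset + MAX_PAYLOAD])
        for seq, offset in enumerate(range(0, len(data), MAX_PAYLOAD))
    ]


def reassemble(packets: List[Packet]) -> bytes:
    """Rebuild the original data at the receiving end, in sequence order."""
    ordered = sorted(packets, key=lambda p: p.sequence)
    return b"".join(p.payload for p in ordered)


message = bytes(range(256)) * 3            # 768 bytes of raw data from the host
packets = packetize(message, source=1, destination=7)
print(len(packets), "packets")
```

The host hands over only `message`; everything else (splitting, addressing, sequencing) happens in the controller's stand-in, `packetize`.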

The InfiniBand design is inherently modular. It enables you to add devices up to the limit of a switch or to add more switches to increase capacity without limit. Individual switches create subnetworks, which exchange data through routers, much like a conventional network.

Structure

InfiniBand is a system area network (SAN), which simply means a network that lives inside a computer system. It can also reach outside the computer to act as a real network. In any of its applications—internal network, expansion system, or true local network—InfiniBand uses the same signaling scheme and same protocol. In fact, it uses Internet Protocol, version 6 (IPv6), the next generation of the protocol currently used by the Internet, for addressing and routing data.

In InfiniBand terminology, a complete IBA system is a network. The individual endpoints that connect to hardware devices, such as microprocessors and output devices, are called nodes. The wiring and other hardware that ties the network together make up the IBA fabric.

The hardware link between a device and the InfiniBand network that makes up a node is a channel adapter. The channel adapter translates the logic signals of the device connected to it into the form that will be passed along the InfiniBand fabric. For example, a channel adapter may convert 32-bit parallel data into a serial data stream spread across four differential channels. It also includes enough intelligence to manage communications, providing functions similar to the handshaking of a conventional serial connection.
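
As a rough illustration of the parallel-to-serial conversion in that example, the sketch below deals the bytes of each 32-bit word across four lanes in turn. The byte order and the round-robin striping are assumptions chosen for clarity; the actual serialization and line coding are defined by the IBA specification and are not modeled here.

```python
from typing import List

LANES = 4  # the four differential channels in the example above


def stripe(words: List[int], lanes: int = LANES) -> List[bytes]:
    """Spread 32-bit words across lanes, one byte per lane in turn."""
    stream = b"".join(w.to_bytes(4, "big") for w in words)
    # Lane i carries bytes i, i + lanes, i + 2 * lanes, ...
    return [stream[i::lanes] for i in range(lanes)]


def unstripe(lane_data: List[bytes]) -> List[int]:
    """Interleave the lanes again and rebuild the 32-bit words."""
    lanes = len(lane_data)
    total = sum(len(d) for d in lane_data)
    stream = bytes(lane_data[i % lanes][i // lanes] for i in range(total))
    return [int.from_bytes(stream[i:i + 4], "big") for i in range(0, total, 4)]


words = [0x11223344, 0xAABBCCDD]
lane_streams = stripe(words)
for number, data in enumerate(lane_streams):
    print(f"lane {number}: {data.hex()}")
```

Each lane then needs to carry only a quarter of the data, which is how a wider link multiplies throughput without raising the signaling rate of any one pair.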

InfiniBand uses two types of channel adapters: the Host Channel Adapter (HCA) and the Target Channel Adapter (TCA). The distinction serves only to identify the two ends of a communications channel. As the name implies, the HCA resides in the host computer, where it serves a microprocessor or similar device. It is typically built into the north bridge of the computer's chipset to take full advantage of the host's performance. (Traditionally, Ethernet would link to a system through its south bridge and suffer the speed restrictions of the host's expansion bus circuitry.) The TCA serves an input/output device, such as a connection to a storage system. Either the HCA or the TCA can originate and control the flow of data.

The InfiniBand fabric separates the system from conventional expansion buses. Instead of using a bus structure to connect multiple devices into the expansion system, the InfiniBand fabric uses a switch, which sets up a direct channel from one device to another. Switching signals rather than busing them together ensures higher speed, because every device has available to it the full bandwidth of the system. The switch design also improves reliability. Because connections are not shared, a problem in one device or channel does not affect others. Based on the address information in the header of each packet, the switch directs the packet toward its destination, a journey that may pass through several switches.
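
The hop-by-hop journey described above can be modeled with a forwarding table in each switch. This is an illustrative sketch only: the class, the integer addresses, and the hop limit are assumptions, and real switches forward between ports rather than Python objects.

```python
from typing import Dict, List, Optional


class Switch:
    """A switch that forwards by destination address (illustrative model)."""

    def __init__(self, name: str) -> None:
        self.name = name
        # destination address -> next switch, or None when the destination
        # node is attached directly to this switch
        self.forwarding: Dict[int, Optional["Switch"]] = {}

    def add_route(self, destination: int, next_hop: Optional["Switch"]) -> None:
        self.forwarding[destination] = next_hop


def trace(first: Switch, destination: int, max_hops: int = 16) -> List[str]:
    """Return the names of the switches a packet passes through."""
    path: List[str] = []
    current: Optional[Switch] = first
    while current is not None:
        if len(path) >= max_hops:
            raise RuntimeError("routing loop suspected")
        path.append(current.name)
        current = current.forwarding[destination]
    return path


a, b, c = Switch("A"), Switch("B"), Switch("C")
a.add_route(42, b)      # A passes packets for node 42 to B
b.add_route(42, c)      # B passes them on to C
c.add_route(42, None)   # node 42 hangs off C
print(trace(a, 42))     # ['A', 'B', 'C']
```

Each switch looks only at the destination in the packet header and its own table, so no device ever has to know the whole path.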

A switch directs packets only within an InfiniBand network (or within a device such as a server). The IPv6 nature of InfiniBand's packets allows easy interfacing of an individual InfiniBand network with external networks (which may, in turn, be linked to other InfiniBand systems). A router serves as the connection between an InfiniBand network and another network.
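
The split between switches and routers comes down to whether the destination address lies inside the local network. The sketch below uses Python's standard `ipaddress` module and assumes, purely for illustration, that the local network is identified by a 64-bit IPv6-style prefix; the prefix and the addresses shown are made-up examples.

```python
import ipaddress

# Hypothetical 64-bit prefix identifying the local InfiniBand network.
LOCAL_SUBNET = ipaddress.IPv6Network("fe80::/64")


def needs_router(destination: str,
                 local: ipaddress.IPv6Network = LOCAL_SUBNET) -> bool:
    """Return True when the destination lies outside the local network."""
    return ipaddress.IPv6Address(destination) not in local


# Same prefix: the packet stays inside the fabric, handled by switches alone.
print(needs_router("fe80::2:c903:1234:5678"))

# Different prefix: the packet must leave through a router.
print(needs_router("2001:db8::1"))
```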

The physical connections within the InfiniBand network (between channel adapters, switches, and routers) are termed links in IBA parlance. The fabric of an InfiniBand network may also include one or more repeaters that clean up and boost the signals in a link to allow greater range.

In a simple InfiniBand system, two nodes connect through links to a switch. The links and the switch are the fabric of the network. A complex InfiniBand system may involve multiple switches, routers, and repeaters knitted into a far-reaching web.

Standards and Coordination

The InfiniBand specification is managed and maintained by the InfiniBand Trade Association. The latest revision of the specification is available from the following address:

InfiniBand Trade Association

5440 SW Westgate Drive, Suite 217

Portland, OR 97221

Telephone: 503-291-2565

Fax: 503-297-1090

E-mail: administration@infinibandta.org

Web site: www.infinibandta.org
