The 10-Year Anniversary of InfiniBand

The 10-year anniversary of InfiniBand (IB) was celebrated this week at the SC09 conference. I am often asked about I/O fabric technologies for connecting servers to external I/O controllers, and this milestone is a great opportunity to look at why InfiniBand succeeds in this role.

When I/O adapters are internal to the server, the ubiquitous technology is of course PCI Express (PCIe). But when the I/O is external to the server, the choices become more diverse. To be a viable candidate, the external I/O fabric needs to inherit some of the properties and capabilities of the internal bus. And it must augment them with the ability to share the same devices across multiple servers, and the ability to scale the I/O connectivity to a large number of servers.

These are some imposing requirements, and they put very specific demands on an external I/O fabric.

The list of attributes that must be inherited from the internal bus isn’t that long, but each one is essential:

  • Reliable data delivery
  • Direct Memory Access (DMA): Direct transfer of data between server memory and I/O controllers
  • Flow control
  • High throughput
  • Low latency

Now, what about the extra capabilities needed when external I/O controllers must be shared across servers? Here’s a list:

  • Scalability: Ability to support large data centers with many servers and multiple I/O systems. Being able to maintain a high ratio of servers to high-capacity I/O systems is clearly attractive for CapEx and ease of management reasons.
  • Network forwarding based on endpoint addresses: This is a corollary of the scalability requirement. Networks with endpoint addressing can scale to a very large number of nodes using switches. Memory-addressed fabric technologies, such as PCIe, were not designed to scale in this manner. PCIe Advanced Switching (also known as ASI) was an attempt to layer PCIe on top of another network technology with endpoint addressing, but it was abandoned by the industry in 2006 due to excessive complexity and other issues.
  • Sharing: It must be possible for multiple servers to share the same I/O controller. This is a basic requirement for achieving consolidation and virtualization.
  • Quality of Service: Ability to implement quality of service for different traffic flows.
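
To illustrate the endpoint-addressing item in the list above, here is a minimal Python sketch of switches that forward purely on a destination address. The Switch class, the route function, and the node names and addresses are all hypothetical and chosen for illustration. Real IB switches forward on destination local identifiers using tables programmed by a subnet manager, and this toy model does not attempt to reproduce those structures.

```python
from dataclasses import dataclass, field


@dataclass
class Switch:
    """Toy switch: forwards on destination endpoint address only."""
    name: str
    # destination endpoint address -> output port number
    table: dict = field(default_factory=dict)
    # output port number -> neighbor (another Switch, or an endpoint name)
    ports: dict = field(default_factory=dict)


def route(first_switch, dest_addr, max_hops=16):
    """Follow forwarding tables hop by hop.

    Returns the names of the switches traversed, followed by the
    name of the endpoint that was reached.
    """
    path = []
    node = first_switch
    for _ in range(max_hops):
        if not isinstance(node, Switch):
            return path + [node]          # reached an endpoint
        path.append(node.name)
        out_port = node.table[dest_addr]  # lookup keyed by endpoint address
        node = node.ports[out_port]
    raise RuntimeError("hop limit exceeded; forwarding loop?")


def build_demo_fabric():
    """Two leaf switches joined by one spine (hypothetical topology).

    Endpoint addresses (illustrative): server-a = 1, io-director = 2.
    """
    leaf1, leaf2, spine = Switch("leaf1"), Switch("leaf2"), Switch("spine")
    leaf1.ports = {1: "server-a", 2: spine}
    leaf2.ports = {1: "io-director", 2: spine}
    spine.ports = {1: leaf1, 2: leaf2}
    leaf1.table = {1: 1, 2: 2}
    spine.table = {1: 1, 2: 2}
    leaf2.table = {1: 2, 2: 1}
    return leaf1, leaf2, spine


if __name__ == "__main__":
    leaf1, _, _ = build_demo_fabric()
    print(route(leaf1, dest_addr=2))
```

Each switch needs only one table entry per endpoint, and no switch has to know anything about the memory map of the server or the I/O controller. Adding a server means adding one address to the tables, which is the property that lets such a fabric grow by adding switches.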

InfiniBand (IB) is an interconnect technology which satisfies both lists of requirements very well. This is not just a lucky coincidence. The development of IB was motivated exactly by the need to develop a next-generation I/O fabric which would enable the large-scale connectivity of servers to external I/O systems.

I have had the good fortune of taking part in the development of IB-based products throughout the entire decade this amazing technology has existed. It has been quite a journey from its inception to its current mature stage where it provides the interconnect for most of the top 100 supercomputer clusters in the world, and where it has mature hardware devices as well as software stacks.

IB is now used for every imaginable demanding application, from car crash simulations to nuclear simulations, from financial program trading to oil exploration to database clustering. It is not surprising that it has conquered so many applications, given its 40 Gb/s server connectivity, 120 Gb/s switch-to-switch links, 60 ns switch hop latency, 1 µs end-to-end latency, remote DMA capabilities, reliable transport protocol, and efficient network topology management.
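
The 40 Gb/s and 120 Gb/s figures follow from simple arithmetic: per-lane signaling rate times the number of lanes in the link. The short Python sketch below works this out for the SDR, DDR, and QDR generations, which signal at 2.5, 5, and 10 Gb/s per lane and use 8b/10b encoding, so the usable data rate is 80% of the signaling rate. The function and constant names are illustrative, not part of any IB software stack.

```python
# Per-lane signaling rates in Gb/s for the first three IB speed generations.
LANE_SIGNALING_GBPS = {"SDR": 2.5, "DDR": 5.0, "QDR": 10.0}

# 8b/10b encoding carries 8 data bits in every 10 signaled bits.
DATA_BITS, SIGNALED_BITS = 8, 10


def link_rates(speed, width):
    """Return (signaling_gbps, data_gbps) for a link.

    speed: "SDR", "DDR", or "QDR"
    width: number of lanes (1, 4, or 12 for 1x, 4x, 12x links)
    """
    signaling = LANE_SIGNALING_GBPS[speed] * width
    data = signaling * DATA_BITS / SIGNALED_BITS
    return signaling, data


if __name__ == "__main__":
    for speed in ("SDR", "DDR", "QDR"):
        for width in (4, 12):
            signaling, data = link_rates(speed, width)
            print(f"{width}x {speed}: {signaling:g} Gb/s signaling, "
                  f"{data:g} Gb/s data")
```

A 4x QDR link gives the 40 Gb/s server connectivity quoted above, and a 12x QDR link gives the 120 Gb/s switch-to-switch rate; both are signaling rates, with the data rates being 32 Gb/s and 96 Gb/s respectively.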

At Xsigo, we have been using IB as the I/O fabric technology for connecting servers to the I/O modules of the I/O Director. It has served as a perfect fit for this application by fully satisfying all the requirements listed above. The system is designed in such a way that the I/O fabric performs its role behind the scenes; the user manages virtual I/O devices such as virtual NICs and HBAs, not an IB fabric. This is analogous to the case of internal I/O where the user manages NICs and HBAs, not the PCIe bus which moves the bits between the CPU/memory and the I/O controllers.

IB technology is an area of continued innovation and development. IB has progressed from SDR to DDR to QDR speeds, switch port densities have increased, and very attractive cost/performance ratios have been achieved. Stable driver and protocol implementations are available for all major operating systems and hypervisors, and most major server vendors support the technology. There is a thriving community of IB software developers within the OpenFabrics Alliance and elsewhere. It has indeed been an exciting decade for this innovative technology, and its vitality and influence will continue growing as it is adopted in a larger variety of applications, including data center I/O virtualization.
