Building the Open Cloud: Part 5

Virtual I/O is a cornerstone of today’s cloud solutions. The previous posts in this series have discussed why virtual I/O is essential to the cloud: resources must be interchangeable and scalable, and virtual I/O provides the any-to-any connectivity that makes this possible.

But what are your options for virtual I/O? And what are the relative merits of the different solutions? That’s the subject of this post.

The Three Types of Virtual I/O

Today there are three distinct virtual I/O technologies on the market. Three years ago, there were none. This is a sure sign that something important is going on here. Visionaries from multiple camps saw the need coming and the R&D dollars followed.

Before we get to the three types, a quick clarification here: there are actually more than three types of I/O management solutions on the market. But for the purposes of this discussion I’m looking only at the fully virtualized I/O solutions that incorporate virtualized Ethernet and virtualized Fibre Channel connections.

Some solutions can only re-map existing physical I/O. While this provides some management flexibility, it is limiting. Re-mapping existing I/O gives you no option to change the I/O mix down the road (if, for example, you later need more Fibre Channel ports).

Consequently, I’m only looking at fully virtualized solutions. This narrows the crowd to these three types:

  • FCoE solutions
  • PCI-E solutions
  • Xsigo virtual I/O

To understand the distinctions between these solutions, a good place to start is to look at what they have in common.

The Common Elements of Virtual I/O

All virtual I/O types now on the market share these attributes:

1) A physical card in the server: All virtual I/O solutions today incorporate a host adapter of some kind in each server.

2) An interconnect: A link between the host card and the consolidation point.

3) A consolidation device: An external device that all the servers connect to. The networks and storage connect to this device rather than to each server individually.

4) Virtual NICs and HBAs: These are software resources, just like virtual machines. To the Hypervisor and OS, they look like conventional NICs and HBAs.

All virtual I/O solutions share these four elements. The nomenclature varies, and you’ll also hear the terms “unified,” “converged,” and “fabric” used, as well as a host of new acronyms (CNA, HCA, SR-IOV, MR-IOV, DCE, etc.).

What’s critical here is that all of the solutions provide some level of “any to any” connectivity and dynamic I/O management, which is what’s essential for the cloud.
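To make the four shared elements concrete, here is a minimal sketch in Python. It is a toy model only: the class, field, and method names (VirtualAdapter, ConsolidationDevice, assign, move) are assumptions made for illustration and do not correspond to any vendor's API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class VirtualAdapter:
    """A vNIC or vHBA: a software resource, not a physical card."""
    name: str
    kind: str          # "vNIC" or "vHBA"
    uplink_port: str   # network/storage port on the consolidation device


@dataclass
class ConsolidationDevice:
    """External device that every server's host card links to (hypothetical model)."""
    uplink_ports: Set[str]
    # server name -> virtual adapters currently assigned to that server
    assignments: Dict[str, List[VirtualAdapter]] = field(default_factory=dict)

    def attach_server(self, server: str) -> None:
        # Models the host card plus interconnect: the server is now reachable.
        self.assignments.setdefault(server, [])

    def assign(self, server: str, adapter: VirtualAdapter) -> None:
        if server not in self.assignments:
            raise KeyError(f"{server} is not linked to this device")
        if adapter.uplink_port not in self.uplink_ports:
            raise ValueError(f"unknown uplink port {adapter.uplink_port}")
        self.assignments[server].append(adapter)

    def move(self, adapter_name: str, src: str, dst: str) -> None:
        """Re-home a virtual adapter on another server; no recabling involved."""
        if dst not in self.assignments:
            raise KeyError(f"{dst} is not linked to this device")
        adapter = next(
            (a for a in self.assignments[src] if a.name == adapter_name), None
        )
        if adapter is None:
            raise KeyError(f"{adapter_name} is not assigned to {src}")
        self.assignments[src].remove(adapter)
        self.assignments[dst].append(adapter)
```

Because the adapters live in the consolidation device rather than in any one server, moving "vnic0" from one attached server to another is a bookkeeping change. This is the "any to any" property in miniature.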

Those are the solution similarities, but there are important differences as well. Directly comparing the solutions is the simplest way to reveal those differences.

Xsigo Compared with FCoE

FCoE (Fibre Channel over Ethernet) was created by the traditional I/O vendors (Cisco, QLogic, Emulex) to converge server I/O. A new Ethernet variant was created to allow FC traffic (which tolerates neither dropped nor re-ordered frames) to move over Ethernet (which, in its conventional form, can drop frames under congestion). We can compare FCoE and Xsigo in detail to see how the solutions differ.

Host Cards: FCoE employs host adapters called “converged network adapters” (CNAs) that essentially combine a Fibre Channel HBA and an Ethernet adapter on one card. Xsigo employs a card called a “host channel adapter” (HCA) that provides a conduit from the PCI bus to the consolidation point.

Here’s why the Xsigo approach has some compelling advantages.

  • Flexible functionality: A CNA is a conventional host adapter; it has fixed functionality (specific Ethernet and FC capabilities). With Xsigo’s HCA, nothing is fixed. The card only moves information from the PCI bus to the I/O Director, which is where all of the I/O intelligence resides (on hot-swappable I/O modules).
  • Higher performance: Xsigo’s HCA offers up to 4X the bandwidth of a CNA (40Gb vs. 10Gb) and incurs about 1/3 the latency.
  • More compatibility: Most server makers offer the HCAs used by Xsigo. Dell, HP, IBM, Hitachi, and SuperMicro all offer the Xsigo-interoperable parts for both blades and rack-mount servers. CNAs, on the other hand, differ from vendor to vendor, and even within each vendor’s product line.
  • Lower cost: Xsigo’s HCA is about 1/2 the cost of a CNA.

Consolidation Point: Xsigo’s consolidation point is the “I/O Director,” a purpose-built device designed specifically for this role. As such, it’s highly configurable. It can accommodate new types of I/O (if you need 40Gb Ethernet down the road, just plug it in), and can also accommodate a large amount of uplink capacity.

The Xsigo I/O Director was also designed for I/O isolation. When a virtual NIC is assigned to a specific port, data from that vNIC is accessible only at the port. From the perspective of isolation, it is the same as having a dedicated cable from that port to the virtual NIC.

With FCoE, the consolidation point is a switch, and is designed like a switch. There is limited uplink capacity and limited flexibility to change I/O capabilities.

QoS: Quality of service is a critical element of the cloud. It lets you guarantee and regulate bandwidth to applications. Critical apps can be ensured the bandwidth they need, while non-critical apps can be prevented from hogging the road.

To do this, Xsigo employs hardware-enforced QoS (it is controlled at the I/O Director and thus demands no server resources). You can set committed and peak information rate parameters to specific vNICs and vHBAs, so you know exactly who’s getting what.
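As a rough illustration of what committed and peak rates mean in practice, the Python sketch below divides a shared link among virtual adapters. It is a software model written for this post's explanation, not Xsigo's hardware implementation: the RateLimits class, the allocate function, and the equal-share rule for spare bandwidth are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class RateLimits:
    cir: float  # committed information rate (Gb/s): the guaranteed floor
    pir: float  # peak information rate (Gb/s): the hard ceiling


def allocate(capacity, limits, demand):
    """Return the Gb/s granted to each adapter for one scheduling interval."""
    # Step 1: honor every commitment first (never granting more than is asked).
    grant = {n: min(demand[n], limits[n].cir) for n in limits}
    spare = capacity - sum(grant.values())
    if spare < 0:
        raise ValueError("committed rates oversubscribe the link")

    # Step 2: share spare capacity equally among adapters that still want
    # more, capping each one at its peak rate.
    def headroom(n):
        return min(demand[n], limits[n].pir) - grant[n]

    hungry = [n for n in limits if headroom(n) > 1e-9]
    while spare > 1e-9 and hungry:
        share = spare / len(hungry)
        for n in hungry:
            extra = min(share, headroom(n))
            grant[n] += extra
            spare -= extra
        hungry = [n for n in hungry if headroom(n) > 1e-9]
    return grant
```

For example, on a 10 Gb/s link with a database vHBA (CIR 4, PIR 8, demanding 9), a backup vHBA (CIR 1, PIR 3, demanding 10), and a web vNIC (CIR 2, PIR 10, demanding 1), this sketch grants 6, 3, and 1 Gb/s respectively. The backup job is held to its peak rate no matter how much it asks for, and the database never drops below its commitment.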

FCoE, on the other hand, uses “priority queuing.” This means you can only prioritize one type of traffic over another (all FC traffic is the same, by the way, and is given the highest priority to ensure that re-ordering does not occur). It does not let you determine which application gets what bandwidth.
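The limitation of priority queuing as described above can be seen in a small Python sketch. This models only strict priority between traffic classes; the function name and the tuple layout are assumptions for illustration, and real switches expose more configuration than this toy does.

```python
def strict_priority(capacity, offered):
    """Serve traffic classes in priority order until the link is full.

    offered: list of (traffic_class, priority, load_gbps) tuples,
             where a lower priority number is served first.
    """
    granted = {}
    remaining = capacity
    for traffic_class, _priority, load in sorted(offered, key=lambda t: t[1]):
        granted[traffic_class] = min(load, remaining)
        remaining -= granted[traffic_class]
    return granted
```

With a 10 Gb/s link, 7 Gb/s of FC traffic at priority 0, and 6 Gb/s of LAN traffic at priority 1, FC receives its full 7 Gb/s and LAN is left with 3 Gb/s. The scheduler sees only classes: nothing in this model can say which application inside the FC class gets how much of that 7 Gb/s.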

I/O Virtualization – External vs. Internal

There is a fundamental difference between Xsigo and FCoE that is worth highlighting: where the I/O intelligence resides.

With Xsigo, the I/O intelligence lies outside the server in the I/O Director. The server’s I/O card is simply a conduit to the I/O Director. With FCoE, the intelligence remains inside the server, on the converged network adapter. That card is an intelligent, fixed asset.

This matters for three reasons:

Cost: The CNA is fundamentally more complex and therefore more expensive (and more power hungry). CNAs are about 2X the price of Xsigo’s cards and deliver 1/4 to 1/2 the performance.

Management: A “cloud” is ideally a pool of fully interchangeable assets. To accomplish this, all servers need the capability to connect to all resources (this is not the same as saying they are connected to all resources all the time). Furthermore, for simplicity, I/O should be uniformly managed across all servers.

With Xsigo, this is easy. Because the I/O intelligence is shared by all servers, I/O access and I/O management are the same for all servers.

FCoE is different. For several reasons, the servers in a pool of FCoE-equipped machines deployed over time are likely to end up differing from each other. First, FCoE cards will vary from vendor to vendor, as vendors jockey for competitive advantage. (Note for example that Cisco blades will certainly be different from Dell and HP blades, since Cisco has stated they will not support their switch in these competing solutions.)

Second, FCoE cards will change over time as features are added. FCoE cards are now in their 3rd generation since inception. The servers with cards made in 2011 will likely be different from those made in 2010.

To be fully interchangeable, your cloud would have to be entirely composed of one generation of product from one vendor (not likely!). More likely, you’ll end up managing resources in silos, or you’ll manage everything uniformly at the “lowest common denominator” level.

Investment Protection: I/O technologies change over time. Recent and ongoing transitions include 4G FC, 8G FC, 10G Ethernet, 40G Ethernet, and FCoE. Since you can’t buy a converged network adapter that supports every I/O type you’ll encounter over the life of the server (3 to 5 years), you’ll probably end up with some servers that support new I/O types and some that don’t.

Xsigo’s external I/O virtualization solves this problem. Simply adding an I/O module to the I/O Director makes a new I/O type instantly accessible to all attached servers. With Xsigo, every server can access the same resources, has the same level of I/O technology, and can be managed the same way.

Xsigo Compared with PCI-E

Several new vendors now offer I/O virtualization solutions based on PCI-E interconnects. Like Xsigo, this is an external I/O virtualization approach where the server’s I/O bus is extended to an I/O consolidation device. In this approach, that device houses off-the-shelf PCI host adapter cards. These cards are then shared across multiple servers.

While this approach does offer some useful benefits, it has limitations that become problematic in enterprise deployments.

Scalability: PCI-E is limited to a specific number of nodes (today, solutions go up to about 32 total devices connected, servers + I/O cards combined). Xsigo employs a fabric that is proven to scale beyond 2,000 nodes.

Centralized management: With PCI-E solutions, I/O is managed server-by-server. With Xsigo, I/O is managed centrally from one location.

Performance: PCI-E solutions use off-the-shelf cards that were not designed to be shared across multiple servers. Vendors get around this with “time-slicing” (cards are allocated to different servers for fixed periods of time). While this is fine for low traffic volume, it becomes inefficient under heavy traffic because bandwidth is allocated by the assigned time slice, not by actual demand. Say, for example, you have five servers with equal time slices. If one of them is experiencing demand while the other four are idle, each still receives its assigned time slice, and the idle servers’ share is not reallocated to the busy server where it is needed. That bandwidth is wasted.
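The five-server example can be worked through in a few lines of Python. Both functions are simplified models written for this comparison. The demand-driven one uses plain proportional sharing as a stand-in and should not be read as Xsigo's actual queuing algorithm, and the server names and the 10 Gb/s card capacity are assumed figures.

```python
def fixed_time_slices(capacity, demand):
    """Each server gets an equal slice of the card, whether it needs it or not."""
    slice_bw = capacity / len(demand)
    return {server: min(need, slice_bw) for server, need in demand.items()}


def demand_driven(capacity, demand):
    """Bandwidth follows demand; scale down proportionally only when oversubscribed."""
    total = sum(demand.values())
    if total <= capacity:
        return dict(demand)
    scale = capacity / total
    return {server: need * scale for server, need in demand.items()}


def utilization(capacity, granted):
    """Fraction of the card's capacity actually carrying traffic."""
    return sum(granted.values()) / capacity
```

Take a 10 Gb/s shared card, one server demanding 8 Gb/s, and four idle servers. Under fixed time slices the busy server is capped at its 2 Gb/s slice, so the card runs at 20% utilization while 6 Gb/s of demand goes unserved. Under demand-driven sharing the busy server receives the full 8 Gb/s it asked for.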

By contrast, Xsigo’s modules are designed and optimized for sharing. Users have found, for example, that Xsigo’s virtual HBAs are often faster than traditional hardware HBAs when multiple hosts simultaneously access a storage device. This is because Xsigo’s queuing algorithms were designed for exactly this scenario.

Quality of Service: Aside from the time slicing already discussed, there is no QoS control in PCI-E solutions. Xsigo’s QoS is designed into Xsigo’s purpose-built I/O cards, regulates actual bandwidth, and is hardware enforced.

Blade server support: Not all blade makers offer the modules needed for PCI-E. With Xsigo, all blade vendors (except one!) offer the needed cards and switches.

Xsigo: The Infrastructure for the Cloud

The bottom line is that Xsigo was designed from the start to support the requirements of the cloud: scalability, flexibility, availability, and performance. And Xsigo executed that design with 100% open standard components. Investment protection and manageability are built in, thus ensuring simplicity and long-term value.

In the next and last post in this series I’ll highlight a few actual examples of cloud deployments.
