Interview Should you use flash solid state drive (SSD) storage as a pretend hard disk drive or as a cache attached to a server’s main bus?
Two approaches have emerged for using flash in large-scale storage applications. EMC, with the help of STEC, says to use drop-in Fibre-Channel-attached SSDs, which function like very, very fast Fibre Channel hard drives, as a small but significant tier zero of storage in large disk drive arrays. IBM, with the help of Fusion-io, thinks you should provide a PCIe link to separate SSD storage, as demonstrated in Project Quicksilver with 4TB of Fusion flash.
Because of its PCIe bus connection, Fusion-io has been thought of as server-accelerating flash rather than storage array flash. Not so, says Rick White, one of the three founders of Fusion-io and its chief marketing officer. In the Quicksilver project the flash is a storage array, but it is connected to an IBM System x server's bus. The server functions, in effect, as a storage array controller.
In this interview Rick White sets out Fusion-io's approach to the server-vs-storage question and to SSD connection. His replies have been edited to bring out what we think are the main points.
El Reg: What are the issues hard drive storage array vendors should consider when thinking about flash-enabling their storage arrays?
Rick White: The cost of NAND flash is the same no matter where it is deployed in the storage infrastructure. What differs is how effectively it is utilised, and the cost of connecting it in.
Making NAND flash connect up like disks do, behind buses and protocols designed for slow mechanical disks, simply wastes much of the medium’s benefit and increases the cost of connecting it in. Putting NAND flash more directly on the PCIe bus, on the other hand, reduces cost and enhances the capabilities inherent to NAND flash, regardless of whether that's the PCIe bus of a server or the PCIe bus of a storage array appliance (again, see IBM's Project Quicksilver).
El Reg: How would you compare and contrast SSD-accelerating servers and storage arrays?
Rick White: It's not about accelerating servers vs. accelerating storage arrays. It is about putting the NAND flash as close as possible to the bus that is common to both and through which the data must flow anyway. Today that bus is PCIe.
What people are missing is that, inside all modern storage array infrastructures, there is a PCIe bus that moves data on/off Fibre Channel to/from the DRAM caches. Placing NAND flash directly off that same bus is the best answer.
Indeed, the difference between server acceleration and storage array acceleration really goes away when one realises that storage array infrastructure is itself made up of servers turned into appliances. These appliances use the same commodity, off-the-shelf Intel/AMD processors, DRAM, PCIe, FC HBAs, etc (that is also true for EMC and NetApp appliances). IBM drove this point home with Project Quicksilver, actually pointing out that it used standard System x servers as its SVC appliances.
It's even more startling to note that, with the performance and capacity density offered by NAND, the differentiation between a server as a consumer of storage and a server as a supplier of storage simply becomes a question of the software used to export that storage from the box, and how much storage is in the box. NAND gives a standard server enough performance density to rival specialised storage appliances, which have to beef up their CPU, memory and PCIe buses to get the same throughput.
El Reg: How should HDD-based storage arrays be flash-enabled: by adding tier zero flash, as in Symmetrix, or by adding a separate SSD enclosure/array, as in IBM's Quicksilver?
Rick White: The beauty of the IBM approach is that there were no additional enclosures. The SSDs went directly inside the storage array appliances (which were just System x servers). The PCIe-attached ioDrives could just as easily be placed inside the existing EMC array appliances, as they too have PCIe.
I'd say directly attaching NAND on the PCIe bus can easily be considered a superior approach. The strategy requires no additional enclosure and offers higher performance at a significantly lower cost. This is especially true when considered in relation to the fact that it's solid state and doesn't need access…
El Reg: What do you think of the issue that server operating systems' disk I/O pattern is ill-suited to SSD use? Can any disadvantages be worked around and how?
Rick White: Yes, while most SSDs have a difficult time with random writes (or even sequential writes for that matter), the ioDrive excels in this regard. Again, by not trying to force NAND flash to look like a disk where it doesn't matter (to the sheet metal, SCSI/SATA bus and controller), but only where it does matter (to the OS and software), the ioDrive provides greater degrees of freedom in solving this problem.
To put it more simply, by allowing NAND to have a hardware interface that accentuates its strengths, one can better cover the weaknesses of the medium while having it simultaneously appear as traditional block storage to the OS. To back this up, I think a single ioDrive that can sustain write speeds of over 500MB/s on Microsoft's Vista operating system should answer the question of whether an advanced controller design can work around existing disadvantages or bottlenecks in an operating system.
El Reg: What is Fusion-io's take on MLC flash and also on write endurance cycle lengthening?
Rick White: MLC will play an important role in enterprise SSDs. Its endurance can be extended through various means, but for many vendors it will still be somewhat limited to non-write-intensive applications. Even so, the cost of MLC is just too compelling not to use it, especially for those who can bound their write load.
El Reg: What is Fusion-io's view on HDD storage arrays' internal and external I/O structures, and how can they cope with SSD I/O patterns? What does this mean for storage array vendors?
Rick White: Object-based storage, which allows for content-aware handling of the storage media, will greatly enhance SSD performance. This ultimately means even tighter software/hardware integration, which is favoured by directly PCIe-attached hardware; that includes software drivers that can convey this information.
Fusion's ioDrive is actually an object storage device that can run in a simplified mode to support traditional dumb block storage. Enhancements to operating systems and file systems to further exploit its object storage capabilities are under way.
Fusion says to use flash as a large cache directly attached to a storage-controlling x86 server's main bus. Put this in a SAN and it becomes just another piece of shared block-level storage. The PCIe-connect idea certainly seems viable, with IBM's Quicksilver demonstrating just that. It does mean, though, that vendors can't drop 3.5-inch form factor SSDs and controllers into, say, Symmetrix and Clariion hard disk drive enclosures, as EMC has done. White doesn't see this as an issue.
It will be very interesting to see if any storage array vendor picks up this idea and brings out an array with disk drives connected by a Fibre Channel Arbitrated Loop to a main x86 server/controller and a separate stash of Fusion-io SSDs connected to that controller’s PCIe bus. Balancing I/O inside the overall array and providing a fast enough I/O pipe to it might be interesting challenges. ®