Flash is the new disk and disk is the new archive: so said an AOL presenter at Brocade's Techday summit. What was he smoking? Is this just spin?
I believe not. I believe we are seeing the dawning of a new storage age, one in which flash is used to store primary, tier one data; capacity disk is used to hold secondary and backup data; and tape once again becomes the archive data backstop. A simple trifecta of a scheme.
The outline of this trifecta is drawn from signals we have noted from companies at the storage frontier:
- EMC has slaughtered the SPECsfs2008 file benchmarks with flash-heavy VNX systems. Huawei-Symantec riposted with a controller-heavy disk system, but the direction is clear. No supplier wanting to head these rankings will submit a disk-based system any more.
- The SPC-1 rankings are going to be dominated by flash systems, both in raw performance and in price/performance, where flash is making outrageously impressive advances.
It simply will not matter how much disk, or how many spindles, you throw at an I/O-intensive storage workload. Flash can service I/Os an order of magnitude faster and do so more cheaply. Seek and ye shall ... No, you've got it wrong. You don't need to seek any more in a world where primary data doesn't have to wait for a disk read/write head to reach it.
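To put rough numbers on the spindle argument, here is a back-of-envelope sketch. The per-device IOPS figures are illustrative assumptions, not numbers from any benchmark cited here, and the function name `units_needed` is made up for the example. It only counts aggregate I/Os per second; adding spindles does nothing for the latency of an individual I/O, which is the other half of the argument.

```python
import math

# Illustrative assumptions only, not measured or vendor-quoted figures.
ASSUMED_DISK_IOPS = 200       # random IOPS from one fast hard drive (assumed)
ASSUMED_FLASH_IOPS = 20_000   # random IOPS from one flash device (assumed)


def units_needed(target_iops: int, iops_per_unit: int) -> int:
    """Return how many devices are needed to reach target_iops in aggregate."""
    if iops_per_unit <= 0:
        raise ValueError("iops_per_unit must be positive")
    return math.ceil(target_iops / iops_per_unit)


if __name__ == "__main__":
    target = 100_000  # hypothetical workload requirement
    print("disk spindles:", units_needed(target, ASSUMED_DISK_IOPS))
    print("flash devices:", units_needed(target, ASSUMED_FLASH_IOPS))
```

Swap in your own device figures; the point is only that the spindle count scales linearly with the I/O target while capacity goes largely unused.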
For bulk data storage in the hundreds of terabytes, flash cannot match disk on cost. Even if 2-bit multi-level cell flash gives way to 3-bit, the rate of disk areal density increase will continue to give disk an advantage. Seagate has just announced terabyte-per-platter 3.5-inch technology and can soon deliver a 3-platter, 3TB drive and, if it cares to do so, a 5-platter, 5TB drive.
The way forward is clear. For enterprise primary data, flash will be the medium, with everything else on disk: SAS or SATA drives spinning at 10K or 7.2K rpm. There will be no need for faster drives. Whatever residual need there is will come from small businesses that don't need, or can't afford, a flash premium, and from enterprise apps in a similar situation. Hybrid drives, those with flash caches, will be a higher-performance option where both capacity and speed are needed.
At the other end of the spectrum, disk will be used for longer and longer-term data storage, especially where virtualised IT environments need fast movement of data within and between data centres, as in cloud computing schemes. But disk, constantly online disk, is always at risk of propagated data errors: witness the Googlemail episode.
The only thing that brought Google out of the hole it was in was a set of Oracle StreamLine 8500 tape libraries and LTO-5 tape. Offline but ready to use in massive tape libraries, tape is the ultimate backstop. It is the data centre's lifebelt and lifeboat. Without it, when the data loss/data corruption storm strikes, you are sunk. It's that simple.
Tape is as glamorous as a box of tissues. But when you need them then, boy, you really need them.
Tape's cost/GB stored blows disk away. Tape's reliability, with today's media and pre-emptive media integrity-checking library software, is far higher than disk's. Tape cartridges don't crash. Tape cartridges aren't spinning all the time, drawing electricity constantly, vibrating themselves slowly to death, generating heat that has to be chilled, and – most importantly – are not always online, always susceptible to lightning-quick over-writing by dud data or to file deletion.
Tape is cheap, safe and reliable, and there is no substitute unless you're prepared to gamble that your disks will be safe. Ask Google how that would have played out. Ask Amazon why it lost data in its outage and why VMware did the same. Tape is the shelter you need in tornado alley – and every data centre with disk drive arrays is located in tornado alley.
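For the code-minded, the scheme argued for here boils down to a lookup table. This is a minimal sketch of the article's three-way split, not any vendor's tiering software; the names `DataClass`, `Medium` and `place` are hypothetical and exist only for illustration.

```python
from enum import Enum


class DataClass(Enum):
    PRIMARY = "primary"      # tier one, I/O-intensive data
    SECONDARY = "secondary"  # bulk secondary data
    BACKUP = "backup"        # backup copies
    ARCHIVE = "archive"      # offline archival backstop


class Medium(Enum):
    FLASH = "flash"
    DISK = "disk"
    TAPE = "tape"


# Placement rule as described in the article: flash for speed,
# disk for bulk secondary and backup data, tape for the archive.
PLACEMENT = {
    DataClass.PRIMARY: Medium.FLASH,
    DataClass.SECONDARY: Medium.DISK,
    DataClass.BACKUP: Medium.DISK,
    DataClass.ARCHIVE: Medium.TAPE,
}


def place(data_class: DataClass) -> Medium:
    """Return the medium this scheme assigns to a class of data."""
    return PLACEMENT[data_class]


if __name__ == "__main__":
    for dc in DataClass:
        print(f"{dc.value:>9} -> {place(dc).value}")
```

A real placement policy would weigh cost, access frequency and recovery objectives; the table only captures the headline rule.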
These are the three tales from the storage frontier: flash for speed; disk for bulk secondary and backup data; tape for the archival backstop. That's the coming storage trifecta we're seeing. ®