EMC's Symmetrix V-Max, announced today, represents the death of monolithic storage arrays and the rise of combined scale-up/scale-out modular arrays built from commodity components, tied together with ASICs, very clever software and cluster interconnects that can span geographical distances. It's also a tribute to 3PAR's T-Class InServ design and solidifies a trend towards modularity already seen with the DMX-4.
The basic V-Max building block is a pair of quad-core 2.3GHz Xeon (5400-series) processors, 16 host and 16 disk enclosure ports (eight each per quad-core Xeon), 128GB of global memory, an EMC ASIC to handle the global memory access, and the RapidIO interconnect endpoints.
EMC can now ride the Intel CPU evolution curve and we can look forward to Nehalem-based V-Max engines, roughly twice as powerful as the existing ones at the quad-core level. (V-Max Plus, anybody?) These should plug in and play with the existing 5400 Xeon-based ones, providing both a scale-up upgrade path and investment protection.
We know that V-Max can scale to 8 V-Max engines with the current release and that FAST software is coming to provide automated data movement between the V-Max storage tiers (flash disk - Fibre Channel disk - SATA disk).
Barry Burke, EMC's senior director and chief strategy officer for the Symmetrix Product Group, says that EMC is developing software to leverage flash drives as something more than just another tier of storage, but doesn't explain what that means. Certainly, in the future, we might see 2-tier V-Max arrays; ones with just a fast flash tier and a bulk storage SATA tier.
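A tiering policy of the kind FAST promises can be sketched very simply: watch each extent's recent I/O rate and migrate it towards the tier that rate deserves. This is purely an illustrative model, not EMC's actual algorithm; the tier names follow the article, but the thresholds and function names are invented for the example.

```python
# Hypothetical sketch of automated tier placement in the spirit of FAST.
# Thresholds are illustrative, not EMC's actual policy.

TIERS = ["flash", "fibre_channel", "sata"]  # fastest to slowest

def place_extent(iops: float, hot_threshold: float = 500.0,
                 cold_threshold: float = 50.0) -> str:
    """Pick a tier for an extent based on its recent I/O rate."""
    if iops >= hot_threshold:
        return "flash"          # hot data promoted to flash
    if iops <= cold_threshold:
        return "sata"           # cold data demoted to bulk SATA
    return "fibre_channel"      # everything in between

workload = {"vm_boot": 1200.0, "db_log": 300.0, "archive": 5.0}
placement = {name: place_extent(iops) for name, iops in workload.items()}
print(placement)
```

In a two-tier flash/SATA array of the sort speculated about above, the middle branch simply disappears.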
V-Max memory considerations
The 8-engine V-Max has eight sets of storage enclosures behind the engines combined into a single logical pool of storage capacity, with the storage processors enjoying the use of global memory made up from each engine's own 128GB local memory, and presumably combined and kept coherent by the EMC ASICs. Any access to memory by a V-Max processor is treated as a local access, and remote accesses are virtualised to seem local.
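The local-looking-remote idea can be modelled in a few lines: a global address space is interleaved across the engines, and the caller's read path is identical whether the owning engine is itself or a peer. This is a toy model under stated assumptions (simple modulo interleaving, dictionary-backed memory), not EMC's implementation, and all names are invented for illustration.

```python
# Illustrative model of a global memory stitched together from
# per-engine local memory, with remote accesses virtualised so that
# every access looks local to the requesting processor.

ENGINE_MEMORY_GB = 128  # per the article: 128GB local memory per engine

class GlobalMemory:
    def __init__(self, engines: int):
        self.engines = engines
        self.local = [dict() for _ in range(engines)]  # per-engine store

    def owner(self, address: int) -> int:
        """Map a global address to the engine holding that slice."""
        return address % self.engines  # simplistic interleaving

    def read(self, requester: int, address: int):
        target = self.owner(address)
        # Local and remote reads look identical to the requester; the
        # ASIC/interconnect hop when target != requester is hidden here.
        return self.local[target].get(address)

    def write(self, requester: int, address: int, value) -> None:
        self.local[self.owner(address)][address] = value

gm = GlobalMemory(engines=8)
gm.write(0, 42, "cache line")
print(gm.read(5, 42))  # engine 5 transparently reads engine 2's memory
```

The real work, of course, is in keeping such a memory coherent, which is presumably where the EMC ASICs earn their keep.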
As well as enabling two kinds of memory access, one local and the other remote to a peer V-Max engine, the architectural design allows for a third kind of memory access and linkage. Burke describes it: "the Architecture allows for a third dimension of interconnect – a connection between different V-Max systems. This interconnect would not necessarily expand to share memory across all the nodes in two (or more) separate V-Max arrays, but it would allow multiple V-Max arrays to perform high-speed data transfers and even redirected I/O requests between different Symmetrix V-Max 'virtual partitions.'"
"This capability of the Architecture will be leveraged in the future to 'federate' different generations of V-Max arrays in order to scale to even greater capacities and performance, and will also be used to simplify technology refreshes. In the future, you’ll be able to “federate” a new V-Max with the one on your floor and non-disruptively relocate workloads, data and host I/O ports."
It seems to me that V-Max virtual partitions could have different characteristics. Data in a partition might be single instanced and compressed Celerra-style, for example.
V-Max uses an interconnect new to most of us - RapidIO. Burke states: "the first generation of the Symmetrix V-Max uses two active-active, non-blocking, serial RapidIO v1.3-compliant private networks as the inter-node Virtual Matrix Interconnect, which supports up to 2.5GB/sec full-duplex data transfer per connection – each 'director' has 2, and thus each 'engine' has 4 connections in the first-gen V-Max."
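Burke's figures lend themselves to a quick back-of-envelope aggregate: 2.5GB/sec full-duplex per connection, four connections per engine, up to eight engines.

```python
# Back-of-envelope fabric bandwidth from the quoted figures:
# 2.5GB/sec full-duplex per RapidIO connection, 2 per director,
# 4 per engine, up to 8 engines in the first-gen V-Max.

GB_PER_CONNECTION = 2.5
CONNECTIONS_PER_ENGINE = 4
ENGINES = 8

per_engine = GB_PER_CONNECTION * CONNECTIONS_PER_ENGINE
aggregate = per_engine * ENGINES
print(f"{per_engine} GB/sec per engine, {aggregate} GB/sec across the fabric")
# 10.0 GB/sec per engine, 80.0 GB/sec across the fabric
```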
Why was RapidIO used and not InfiniBand? After all, RapidIO has evolved from bus technology whereas InfiniBand is already used in processor clusters and storage clusters (Isilon). Sun has just chosen Mellanox's 40Gbit/s InfiniBand ConnectX adapters and InfiniScale IV switch silicon for its new line of Sun Blade modular systems and Sun Datacenter Switches.
Burke says RapidIO was selected for its "non-blocking, low latency, high bandwidth, parallelism and cost efficiency – RapidIO has been used in a broad range of embedded applications from MRI systems to military fighter jets." He adds: "the Virtual Matrix Architecture doesn’t limit the fabric to being 2 RapidIOs; it could be 4 or 8 RapidIO networks running in parallel, or it could be built on a different infrastructure altogether – InfiniBand, FCoE/DCE (Data Centre Ethernet) – or whatever comes along in the coming years."
Geographic V-Max clusters
This 8-controller plus ASIC idea with back-end storage aggregated into a single virtual pool is reminiscent of 3PAR's T-Class InServ array. It even resembles a CX4 Clariion controller with a virtual matrix backend, according to one storage commentator. That would also play to the notion that V-Max is monolithic Symmetrix storage re-invented in a modular storage style.
Of course it should go way beyond that. We understand that the V-Max architecture can cope with 256 engines - 256 modular array components - and that these will, in time, be able to be linked across geographical distances. That means that the global memory can be kept coherent across geographic distances, and so can the single virtual storage pool.
It means in turn that data, such as virtual machine (VM) files and VM application data, can be moved - or will be moved - automatically by the FAST software from one part of this geographically tightly-coupled storage cluster to another. This will work both for VMware and for Hyper-V. The idea of virtually instantaneous disaster recovery, and of optimising workload performance across hundreds of storage processors and geographic distances, is very attractive.
Another aspect is that, from a virtualised data centre point of view, V-Max looks like storage that can be managed by and for virtualised servers, along with virtualised networks. The three virtualised elements of a virtual data centre - servers, storage and networking - will be able to work together for and be managed together by virtual data centre software. (There is a VMware announcement coming on April 21st, which will probably expand on this.)
A question is whether the RapidIO switches that are currently available can cope with 256 V-Max engines. These switches collectively form a fabric with a backplane that provides the very high bandwidth and low latency interconnect glue enabling the separate V-Max engines to function as one. Increasing the number of engines sending messages across that backplane by a factor of 32 would greatly increase the packet load. Can RapidIO cope?
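The article's factor-of-32 engine count arguably understates the pressure on the fabric: if every engine can exchange traffic with every other, it is the number of distinct engine pairs that matters, and that grows roughly with the square of the engine count. A quick illustrative calculation:

```python
# Rough scaling intuition for the 8 -> 256 engine question: in a
# full-mesh traffic pattern, distinct communicating engine pairs grow
# quadratically, not linearly, with engine count.

def engine_pairs(n: int) -> int:
    """Number of distinct unordered engine pairs: n choose 2."""
    return n * (n - 1) // 2

small = engine_pairs(8)      # first-gen V-Max
large = engine_pairs(256)    # the architecture's claimed ceiling
print(small, large, round(large / small))  # 28 32640 1166
```

A 32x engine increase thus means on the order of a thousand times more potential conversations for the switches to arbitrate, which is why the question of RapidIO's headroom is worth asking.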
InfiniBand looks set to gain virtual segmentation, and RapidIO may go in the same direction. Alternatively, a different protocol might be used in future, with RapidIO bridges, as Burke alludes to above. It's understood that V-Max can have a different interconnect protocol applied if, say, the InfiniBand or 10GigE suppliers come up with a better one. The details of the interconnect are not visible to the software running in the V-Max engine processors, so slotting in a new interconnect should not affect the upper layers of the V-Max architecture.
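The layering Burke describes - upper layers talking to an abstract interconnect, so the transport can be swapped without disturbing the software above - is a familiar interface pattern. The sketch below is an illustrative analogy only; all class and function names are invented, and real fabric drivers would of course do far more than format a string.

```python
# Sketch of interconnect abstraction: the engine software codes against
# an abstract fabric, so RapidIO could be swapped for InfiniBand (or
# FCoE, or whatever comes along) beneath an unchanged upper layer.

from abc import ABC, abstractmethod

class Interconnect(ABC):
    @abstractmethod
    def send(self, engine_id: int, payload: bytes) -> str:
        """Deliver a payload to a peer engine; return a delivery record."""

class RapidIOFabric(Interconnect):
    def send(self, engine_id: int, payload: bytes) -> str:
        return f"RapidIO frame -> engine {engine_id}, {len(payload)} bytes"

class InfiniBandFabric(Interconnect):
    def send(self, engine_id: int, payload: bytes) -> str:
        return f"IB verb -> engine {engine_id}, {len(payload)} bytes"

def mirror_write(fabric: Interconnect, peer: int, data: bytes) -> str:
    # The engine software never inspects which fabric it was handed.
    return fabric.send(peer, data)

print(mirror_write(RapidIOFabric(), 3, b"cache-mirror"))
print(mirror_write(InfiniBandFabric(), 3, b"cache-mirror"))
```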
With EMC seeing that physical backplane designs simply cannot scale enough to support the potentially thousands of VMs running in a virtual data centre, it seems intuitively obvious that HDS and IBM will come to the same conclusion.
Their USP-V and DS8000 architectures will have to evolve to modularity too. Interestingly, NetApp is radically enhancing its storage array clustering capability with ONTAP 8.0 expected in a month or two, and has already said additional high-end hardware systems are on its roadmap.
We might see the top four storage array suppliers - EMC, HDS, IBM and NetApp (HP and Sun OEM HDS USP-V arrays) - all transitioning to scale-up/scale-out modular architectures at the high end of their product lines. ®