Marathon Technologies, the clustering spinout founded by some ex-DECers back in 1993, continues to tweak its everRun line of high-availability and fault-tolerant clustering products for x64 physical and virtual servers. And with the new everRun MX release, the fault tolerance for XenServer-based virtual machines can now span machines with multiple cores and multiple processors rather than confining a virtual machine to a single processor core.
According to Jim Welch, the ex-IBMer tapped to be Marathon's president and CEO a year ago, this is a big deal. While the high-availability clustering software for systems sold under the everRun HA name could run on machines with multiple processors (symmetric multiprocessing) and with multicore processors, the everRun FT fault-tolerant clustering of the XenServer hypervisor running on a pair of machines has been limited to providing fault tolerance on a single core at a time.
Welch didn't want to divulge to El Reg what the "major breakthrough" is for the new everRun MX clustering technology, but the lockstepping of two processors requires a lot of communication and checkpointing between the nodes, and as you move to multicore processors and/or SMP systems for hosting the hypervisor, the traffic supporting the fault tolerance grows exponentially and quickly swamps the system.
VMware's HA feature for its vSphere 4.0 and 4.1 stacks is also limited to failover for a single core for this reason, and according to Welch, VMware had hoped to bring an SMP-capable HA feature for the ESX Server hypervisor to market.
Welch says that it has taken Marathon eight years to come up with a new technique, and that the company had its breakthrough two years ago, which it is now commercializing as everRun MX.
This seems like a good reason for VMware to buy privately held Marathon and get its hands on the secret sauce before Citrix Systems does. And that is all the more reason for Citrix Systems to buy Marathon before VMware makes its move.
With everRun MX, Marathon can provide fault-tolerant lockstepping of virtual machines that scales up to eight processor cores per virtual machine and supports multiple VMs per physical box. everRun MX supports only the XenServer hypervisor, which Marathon OEMs from Citrix Systems. Neither Hyper-V from Microsoft nor ESX Server from VMware has been supported with the earlier single-core everRun FT clustering at the VM level because Hyper-V and ESX Server are closed-source products. Marathon could have partnered with Microsoft and VMware to gain access to source code and provide FT clustering for multicore and SMP VMs, of course. But that kind of partnership is a pain in the neck, and Citrix is eager to partner with anyone to help in its war against VMware in the server and desktop virtualization rackets.
With the breakthrough that Marathon has made with everRun MX, Welch says that it no longer needs the source code to Hyper-V or ESX Server to provide FT lockstepping. "We had to get deep into the bowels of the hypervisor and the operating system with the older everRun architecture, but we don't have to do this any more," says Welch. So, in theory, Hyper-V and ESX Server support could be coming soon if enough customers demand it.
The everRun MX software has been in alpha testing since February and has been in beta testing for a few months, according to Welch. EverRun MX requires a pair of x64-based servers, each with 4GB of main memory and Intel processors with the VT virtualization electronics baked into the chips inside each server. The servers can use local direct-attached storage or access their files from an iSCSI or Fibre Channel storage area network. The availability link between the two machines, which keeps the VMs running atop the XenServer hypervisor, consists of four Gigabit Ethernet ports, so you don't need to go crazy with InfiniBand or 10 Gigabit Ethernet.
At the moment, everRun MX is certified to support only Windows guests inside of the virtual machine partitions. Windows Server 2003 SP2, Windows Server 2008, and Windows Server 2008 R2 are all supported, in either 32-bit or 64-bit versions. The everRun MX software presents a single operating environment (with one IP address and one MAC address) to the outside world, which means end users don't have to change anything in terms of their network connections to applications in the event of a crash that wipes out a VM on one machine. They just get routed to the second machine, which is already lockstepping the workload, and nobody misses a beat.
The everRun MX FT clustering software for virtual machines comes in two flavors, and rather than pricing them based on features, Marathon is pricing them based on the size of the company. So the Standard Edition includes a license for two x64-based servers (each with up to eight processor sockets) at $10,000 for companies with 2,000 employees or fewer. If you have more than 2,000 employees, you have to buy the Enterprise Edition, which costs $15,000 for a pair of servers.
The thinking behind this pricing is that, in general, larger companies use more features and use bigger systems and therefore should pay more. Marathon, however, doesn't want to have the Standard Edition be some sort of crippleware version of the Enterprise Edition. It wants all customers to have all features.
Customers probably won't react well to this pricing and will almost certainly demand the lower price, despite the logic. Marathon could just price the code based on the number of sockets in the box: $5,000 for a pair of machines with two sockets each, $10,000 for a pair with four sockets each, and $15,000 for a pair with eight sockets each. This seems more fair.
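For the spreadsheet-inclined, here is a rough Python sketch that puts the two schemes side by side: the headcount-based pricing Marathon announced and the socket-based alternative floated above. The dollar figures and the 2,000-employee cutoff come from the numbers quoted here; the function names are made up for illustration, and rounding an in-between socket count up to the next tier is our assumption, not anything Marathon has said.

```python
# Illustrative only: function names are hypothetical, not part of any Marathon tool.
# All prices are per pair of servers, in US dollars, as quoted in the article.

EMPLOYEE_CUTOFF = 2_000          # Standard Edition applies at or below this headcount
STANDARD_PRICE = 10_000
ENTERPRISE_PRICE = 15_000

# (maximum sockets per server, price per pair) for the suggested alternative
SOCKET_TIERS = [(2, 5_000), (4, 10_000), (8, 15_000)]


def employee_based_price(employees: int) -> int:
    """Announced scheme: the edition, and so the price, depends on company size."""
    if employees < 0:
        raise ValueError("employee count cannot be negative")
    return STANDARD_PRICE if employees <= EMPLOYEE_CUTOFF else ENTERPRISE_PRICE


def socket_based_price(sockets_per_server: int) -> int:
    """Suggested scheme: price depends on sockets in each machine of the pair.

    Assumption: a socket count between tiers is billed at the next tier up.
    """
    if sockets_per_server < 1:
        raise ValueError("a server needs at least one socket")
    for max_sockets, price in SOCKET_TIERS:
        if sockets_per_server <= max_sockets:
            return price
    raise ValueError("more than eight sockets per server is not covered")


if __name__ == "__main__":
    # A 5,000-employee company protecting a pair of two-socket boxes.
    print("headcount pricing:", employee_based_price(5_000))
    print("socket pricing:   ", socket_based_price(2))
```

Under these assumptions, the big company with a pair of small two-socket boxes pays $15,000 under headcount pricing but would pay $5,000 under socket pricing, which is the kind of gap likely to provoke the haggling described above.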
Marathon emerged from bankruptcy in 2003 and has raised $27.8m in funding since that time. Its second round of funding was announced earlier this year, bringing in $13.5m that arrived in two tranches (one in August 2009 and another in February 2010). Welch says that since February the company has added more than 500 customers (the base is now over 3,000 unique companies) and now has over 14,000 distinct implementations of its various everRun products in the field. ®