Nine months after an engineer accidentally deleted its Amazon-like compute cloud - and six months after a second major outage - FlexiScale has finally completed a software overhaul meant to avoid such extended blackouts.
And if you're interested in building your own infrastructure cloud, it might even sell you this new code base.
In late August, an engineer with the UK-based FlexiScale managed to delete one of the main storage volumes inside the company's high-profile compute cloud - which offers on-demand storage, processing, and network bandwidth a la Amazon Web Services. Customers were earthbound for several days.
Then, in October, a core network failure kept most customers offline for another 24 hours - or more. It's the thought of such outages that keeps many away from the clouds - Amazon has suffered outages of its own - but FlexiScale chief exec Tony Lucas says he has the problem licked.
"The problem we had is that the old code platform wasn't fast enough, wasn't scalable enough, for us to deal with outages," Lucas tells The Reg. "The outage we had in October took us the better part of the day to recover from. Now, we can recover in about 15 minutes."
The old platform was based on Virtual Iron, the Xen-based virtualization manager recently snapped up by Oracle. According to Lucas, VI isn't suited to the sort of scalable, on-demand virtual infrastructure FlexiScale aspires to.
"Virtual Iron did not fit our use case," Lucas says. "For the amount of jobs we were trying to run, the scale we were trying to build it to, it wouldn't work."
In essence, when FlexiScale was forced to restart servers, Virtual Iron couldn't restart more than one at a time. So Lucas and company spent several months rolling their own platform for serving up infrastructure resources via the web.
The new platform - FlexiScale 1.5 - is based on Xen and Linux, and atop this base, the company offers a home-grown management system that oversees switches, storage, and Xen installations. Lucas claims you can now log into the system, create a server, boot the server, and log into the server in about 50 seconds. "We haven't seen anyone get close to that," he says, claiming that Amazon typically requires three to five minutes for a boot and log-in.
And it scales well, he says. He can launch 10 servers in two minutes, and he's confident that the system will eventually get to the point where it can launch 1,000 virtual servers in three or four minutes.
Lucas is so confident, he plans on licensing the new code to other would-be cloud providers. FlexiScale is an outgrowth of hosting company XCalibre, and Lucas sees a business in helping other hosting outfits follow the same road. "Hosting companies have told us that they don't have the skills to build something like this," he says. "And they've started asking us if they can license ours."
Unlike Amazon, FlexiScale takes a more-the-merrier approach to the infrastructure cloud market, with Lucas calling for open standards that would allow companies to move applications between competing services. "We've always said that we will endorse the standard that makes the most commercial sense and gets the most ground-swell of support behind it," he says. The company is currently leaning towards the OCCI standards effort launched by the Open Grid Forum, but it has yet to officially sign on.
Like so many other companies, FlexiScale has approached Amazon about facilitating cloud migration, but Amazon prefers to greet such inquiries with silence. "I've tried more times than I can count, and I've given up," Lucas says. "They're not interested."
FlexiScale 1.5 has already been rolled out to existing customers, and the new version makes its official debut on Monday. ®