Analysis: Blade servers, virtualization software and fancy accelerators might be all the rage in the server business, but Google doesn't want any part of the hype.
Google will continue crafting its own low-cost, relatively low-performing boxes to support its software-over-the-wire plans. The ad broker looks to focus on lowering energy costs, improving its parallelized code and boosting component life spans rather than messing with things such as VMware and GPGPUs (general purpose GPUs). So, those of you buying into the software as a service idea might want to have a think about Google's contrarian approach when the likes of HP, IBM, Sun Microsystems and Dell come hawking their latest and greatest kit.
Okay, sure, basing your data center designs on Google's whims might not be the most practical course of action. Google builds new data centers at an astonishing pace and works on a scale seen by only the largest service providers.
"Our applications don't run on anything smaller than a data warehouse," said Google engineer Luiz André Barroso, while speaking last week at the Usenix event in Santa Clara, California.
By data warehouse, Barroso means a facility with software spread across thousands of systems. Google has announced at least four such $600m systems in the last few months just in the US. This type of scale has Google working on software problems, energy issues and component conundrums beyond the realm of conception for most companies. For that reason, Google has largely bypassed the Tier 1 server and software vendors' pitches.
For example, Barroso noted that he "loves" the people at VMware but doesn't plan to use their software.
"I think it will be very sad if we need to use virtualization," he said. "It is hard to claim we will never use it, but we don't really use it today."
Instead, Google relies on maintaining tight control over its entire software infrastructure, from the OS level on up to the applications and management packages. It's constantly fine-tuning these systems to create a data warehouse that almost acts as a single, massive virtualized system.
And rather than waiting for ISVs to write multi-threaded code for dual- and quad-core chips out there, Google has decided to do much of the work on its own.
"We might be one of the few companies on the planet that throws software away and writes it from scratch," Barroso said.
Along those lines, Google recently acquired PeakStream - a start-up dedicated to improving the performance of software written in a single-threaded model when run on multi-core processors and GPGPUs. Google sidestepped right past the GPGPU technology to have the PeakStream crew focus instead on improving Google's existing code.
"Our problems don't fit (GPGPUs) today," Barroso said.
Google is ready and willing to spend any amount of money to wring the last bit of extra performance out of its data centers. This attitude comes from a company struggling to keep AdWords, Gmail and YouTube powered. It also seems to come from a company that has hopes of sending even more software - say a client OS and all the accompanying applications - over the wire in the future, as far as we can tell.
Few, if any, vendors will be willing to match Google dollar for dollar in this race. Still, according to Barroso, Google's current experiments could end up as standard computing models. A PC in ten years "might have similar problems as a warehouse computer does today," Barroso said, noting the silicon makers' push to create processors with tens and even hundreds of cores.
In addition, Google's current battles with power consumption might lead to a new model for selling servers.
"If power keeps increasing with performance, you'll end up where the power bill will cost more over the lifetime of the machine than the cost of what you paid for the server," Barroso said. "That is not good for anyone.
"It could result in interesting business models where you buy a server the way you buy your cell phone today. . . It's not unreasonable to think that you sign a contract with PG&E, and they will just give you the server if you agree to pay a certain amount for power over five years."
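Barroso's warning is simple arithmetic: a server's lifetime electricity bill is just its average draw multiplied by hours of operation and the price of power. A minimal sketch, using entirely hypothetical numbers (a $3,000 box, a $0.10/kWh rate), shows how to find the draw at which the power bill overtakes the purchase price:

```python
def lifetime_power_bill(watts, rate_per_kwh, years, hours_per_year=8760):
    """Electricity cost of running a machine continuously for its lifetime."""
    return watts / 1000.0 * hours_per_year * years * rate_per_kwh

# Assumed figures for illustration only - not Google's actual numbers.
server_price = 3000.0   # USD purchase price (assumed)
rate = 0.10             # USD per kWh (assumed)
years = 5

bill = lifetime_power_bill(250, rate, years)   # a modest 250 W server
print(f"5-year power bill at 250 W: ${bill:,.0f}")

# Draw at which the lifetime power bill equals the server's price:
crossover_watts = server_price / (rate * 8760 * years / 1000.0)
print(f"Crossover draw: {crossover_watts:.0f} W")
```

At these assumed rates a 250 W box costs about $1,095 in power over five years, but push the per-server draw (including cooling overhead) toward 700 W and the bill overtakes the hardware cost, which is exactly the regime Barroso is worried about.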
While it used to dispose of data center metrics after a few weeks, Google now collects and keeps just about every tidbit of information it can on system and component performance. The ad broker studies obvious areas such as system utilization and power consumption. It also digs into more unique items such as monitoring the health of certain types of disk drives over their lifetimes. As part of this process, Google examines drives from various manufacturers running at a wide range of speeds.
Oddly, Google has found that running drives "a little bit warmer" actually seems to improve their reliability. The company also discovered that it's near impossible to predict when a particular drive in a system will likely fail, but it can figure out the "failure rates of an entire drive population".
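The population-level statistic Barroso describes is typically an annualized failure rate: failures observed divided by drive-years of operation, computed over a whole fleet rather than predicted per drive. A minimal sketch with made-up fleet data (the vendor names and figures are illustrative, not Google's):

```python
from collections import defaultdict

# Hypothetical observations: (vendor, drive_years_observed, failures)
fleet = [
    ("vendor_a", 1200.0, 30),
    ("vendor_a",  800.0, 28),
    ("vendor_b", 1500.0, 30),
]

def afr(failures, drive_years):
    # Annualized failure rate: expected fraction of a drive population
    # failing per year. Meaningful for the fleet, useless for predicting
    # when any individual drive will die.
    return failures / drive_years

totals = defaultdict(lambda: [0.0, 0])
for vendor, years, fails in fleet:
    totals[vendor][0] += years
    totals[vendor][1] += fails

for vendor, (years, fails) in sorted(totals.items()):
    print(f"{vendor}: AFR = {afr(fails, years):.1%} over {years:.0f} drive-years")
```

With enough drive-years behind the estimate, differences between vendors, models or operating temperatures become statistically visible, which is the advantage Google's data hoarding buys it.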
Such knowledge can provide Google with a major advantage over competitors. (Your company would probably be quite well served by buying drives from Google's preferred vendor.)
Google speaks about these data center issues with the same passion as a company such as Sun or HP. In fact, Google has such a solid handle on the data warehouse concept that we wonder if it won't outpace Sun, HP and IBM at delivering massive scale centers that other companies can tap to distribute their software.
Take, for example, HP. Its own internal IT transition will see the company move from more than 80 data centers to six centers in three locations. Part of this shift will result in HP running close to 50 per cent of its operations on the company's own blade servers. HP also plans to use a huge number of virtual machines.
HP tells the world this because it wants to sell you blade servers and virtual machines. "If we're buying this, so should you."
That message must resonate with HP's traditional customers, but will it prove appealing to the web services crowd that's meant to be taking over the world?
You're told that future data centers will tolerate failures with ease, adjust to spikes in demand on-the-fly and support the software as a service model where code comes running out of your, er, internet tap. And, yet, isn't it Google and not the Tier 1s that seems closest to pulling off this vision?
Barroso's speech hasn't changed our life, and it shouldn't change your near-term data center plans. It has, however, made us think a bit harder about virtualization software, blades and similarly hyped kit. Will such products really provide the basis for the utility computing model proposed by all the major hardware and software vendors? We're thinking not.
Google may look like an over-customized, one-off shop at the moment. But as data explodes, chip cores multiply, hardware costs fall and energy costs rise, it may well end up as the unavoidable standard.
You can find Barroso's presentations here. ®