17th November 2008

Top 500 supers: Big Blue Roadrunner outpaces Jaguar

Petaflops abound

SC08 The Supercomputing 2008 trade show kicked off this past weekend, and the centerpiece of the annual event, which is being hosted in Austin, Texas, is the Top 500 ranking of supercomputers that comes out twice a year. This time around, Cray's Jaguar has tried to catch IBM's Roadrunner and has come up with feathers in its mouth. Maybe Cray should have nicknamed it Wile E. Coyote?

Both machines are rated above 1 petaflops of number-crunching performance, and a whole bunch of other supers on the list are moving into the quadrillion ops arena. IBM's Roadrunner machine, which is installed at the Los Alamos National Laboratory, one of the handful of labs operated by the U.S. Department of Energy, was upgraded over the summer, thereby denying Cray's Jaguar bragging rights. But not by much.

The Roadrunner machine, which is a hybrid Opteron-Cell custom blade box made by IBM, was upgraded a bit to push its performance to 1.105 sustained petaflops, up from 1.026 petaflops on the June 2008 list. Roadrunner was the first machine to crack the petaflops barrier. Beep, beep!

But Cray's Jaguar XT5 Opteron-based massively parallel super, installed at another DOE site at Oak Ridge National Laboratory, got more upgrades this cycle and was able to boost sustained performance from 205 teraflops using XT4 frames and 2.1 GHz quad-core Opterons to a much larger XT5 machine rated at 1.059 sustained petaflops.
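To put the two upgrades side by side, here is a minimal sketch in Python that uses only the sustained Linpack figures quoted above. The helper name `gain` is made up for illustration and is not part of any Top 500 tooling.

```python
# Sustained Linpack figures quoted in the article, in teraflops.
roadrunner_june, roadrunner_nov = 1026.0, 1105.0
jaguar_june, jaguar_nov = 205.0, 1059.0


def gain(before, after):
    """Return the ratio after/before (illustrative helper)."""
    return after / before


print(f"Roadrunner: {gain(roadrunner_june, roadrunner_nov):.2f}x")
print(f"Jaguar:     {gain(jaguar_june, jaguar_nov):.2f}x")
print(f"Gap at the top: {roadrunner_nov - jaguar_nov:.0f} teraflops")
```

On these numbers Roadrunner grew by a little under 8 per cent while Jaguar grew roughly fivefold, leaving a gap of 46 teraflops between the two.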

As we go to press, the feeds and speeds of these two boxes in the Top 500 list, as well as the other 498 machines, are not yet available. Starting with the June 2008 list, machines were also rated for their power consumption, making it possible to rank the supers based on how many flops per watt they deliver. Another interesting game to play with the Top 500 list is to compare sustained performance on the Linpack test to peak performance, giving a sense of the efficiency and true cost of a particular super and its architecture.
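Both games boil down to simple ratios. The sketch below, in Python, shows one way to compute them once the detailed list is published. The `Super` class, its field names, and the two sample machines are all hypothetical, and the numbers are placeholders rather than figures from the list.

```python
from dataclasses import dataclass


@dataclass
class Super:
    name: str
    rmax_tflops: float   # sustained Linpack performance
    rpeak_tflops: float  # theoretical peak performance
    power_kw: float      # measured power draw

    def efficiency(self) -> float:
        """Fraction of peak performance actually sustained on Linpack."""
        return self.rmax_tflops / self.rpeak_tflops

    def mflops_per_watt(self) -> float:
        """Sustained megaflops delivered per watt consumed."""
        # 1 teraflops = 1e6 megaflops; 1 kW = 1e3 W
        return (self.rmax_tflops * 1e6) / (self.power_kw * 1e3)


# Placeholder machines: these values are invented, NOT taken from the Top 500.
machines = [
    Super("Machine A", rmax_tflops=800.0, rpeak_tflops=1000.0, power_kw=2000.0),
    Super("Machine B", rmax_tflops=600.0, rpeak_tflops=1000.0, power_kw=1000.0),
]

# Rank by energy efficiency, best first.
for m in sorted(machines, key=Super.mflops_per_watt, reverse=True):
    print(f"{m.name}: {m.efficiency():.0%} of peak, "
          f"{m.mflops_per_watt():.0f} megaflops per watt")
```

With these made-up inputs, Machine A sustains more of its peak but Machine B delivers more flops per watt, which is why the two rankings can tell different stories about the same pair of boxes.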

The Top 500 list is put together twice a year to provide the feeds and speeds of the fastest 500 supercomputers in the world. Erich Strohmaier and Horst Simon, computer scientists at Lawrence Berkeley National Laboratory, Jack Dongarra of the University of Tennessee, and Hans Meuer of the University of Mannheim compile the list, which is based on the Linpack Fortran benchmark test created by Dongarra and colleagues Jim Bunch, Cleve Moler, and Pete Stewart back in the 1970s to gauge the relative performance of computers of all stripes and sizes on numerical calculations. The first official Top 500 list came out in 1993, and this November 2008 compilation is the 32nd edition of the list.

Cray, which has been struggling financially and has been adversely impacted by the delay in the "Budapest" quad-core Opterons from Advanced Micro Devices that are used in the XT line of supers, is obviously very happy to be at the top of the list and in contention for the title. Cray may bear the name of the venerable maker of supercomputers from the 1970s, but it is an amalgam of supercomputer makers and architectures that are being converged to make Cray a credible alternative to IBM for custom supers here in the States. (The U.S. government likes to have at least two indigenous suppliers of high-end gear, more if possible, and has funded a substantial amount of the research behind super designs from IBM, Cray, and Silicon Graphics.)

SGI has also in recent years moved away from its Itanium-based, global memory Altix 4700 supercomputers (kickers to its old NUMA-MIPS machines) and towards its Altix ICE blade designs, which are based on x64 processors. SGI wants to be in the game of supplying big iron to government labs and agencies. And NASA's Ames Research Center is SGI's sugar daddy, as it has been for a long time, and in the November 2008 ranking, the "Pleiades" Altix ICE super has just squeaked by the former leader on the list, a BlueGene/L Linux-Power machine, to take the third spot in the list.

The SGI Pleiades machine was ranked at 487 teraflops of sustained performance, pushing the BlueGene/L box at Lawrence Livermore National Laboratory, rated at 478.2 teraflops, down to the fourth slot in the list. Number five on the list is also an IBM box - also installed at a U.S. government lab - the BlueGene/P kicker to BlueGene/L, which is running at the DOE's Argonne National Laboratory and which is rated at 450.3 teraflops after some upgrading during the summer months.

The local favorite

The "Ranger" super installed at the University of Texas in Austin is one of the local favorites on the list, and it comes in at number six. The installation of Ranger is one of the bragging rights boxes that Sun Microsystems has put into the field after being an also-ran in supercomputing for many years. It also vindicates Sun's substantial investment in InfiniBand switching and blade technology. Ranger was upgraded since June, too, and now the X6420 blades in the box deliver 433.2 teraflops of sustained performance, up from 326 teraflops in the prior list.

Cray will also be pointing out that it has a new XT5 super installed at Lawrence Berkeley National Lab, rated at 266.3 sustained teraflops, which gives it the number seven box on the list. Another XT4 at Oak Ridge, rated at 205 teraflops, and the mother of all these machines, the "Red Storm" Opteron-based parallel super at Sandia National Laboratories, rated at 204.2 teraflops, come in at numbers eight and nine on the list.

Of course, the top-end machines on the Top 500 list are relatively exotic compared to the bottom three-quarters of the list. As has been the case for a long time now, the x86 and now x64 architecture reigns supreme across the list, just as it does (at least in terms of box count and aggregate performance) in general purpose computing. So it is no surprise that 370 systems on the November 2008 supers list use Intel's processors, down a smidgen from the 375 machines using Intel chips in the June 2008 ranking. (Most of these are x64 processors, but there are quite a number of Itanium-based machines.)

Another 59 machines use AMD's Opteron processors, and 60 machines use one or another variant of IBM's PowerPC or Power processors. These machine counts are relatively unchanged from the June 2008 list. So while there has been plenty of churn in machine rankings in that short period of time, architectural choices are holding steady.

Not at all surprisingly, multicore processors have now become the norm in systems ranked in the Top 500 list. There are 153 systems using dual-core chips and 336 machines using quad-core chips under their skins. There are only four machines still using single-core processors, and nine machines are using IBM's hybrid nine-core, Power-derived Cell processor - Roadrunner being the dominant one.

One of the reasons why Hewlett-Packard bought Compaq eight years ago is that Compaq had a pretty decent supercomputing business thanks to its acquisition of Digital Equipment and its dominant position in rack-based servers. HP had a slice of the super biz as well, having bought Convex in 1995; the maker of vector minisupers gave HP a lot of the system tech that eventually made its way into the HP 9000 V-class and Superdome machines. And while HP doesn't get the flashy deals at the top of the Top 500 list, it has 209 systems on the list (41.8 per cent of machines) compared to IBM's 188 machines (37.6 per cent of the total).
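The percentages are simply system counts divided by the 500 slots on the list. A quick sketch in Python, using only the two vendor counts quoted above, with illustrative variable names:

```python
TOTAL = 500  # number of slots on the list

# System counts quoted in the article.
systems = {"HP": 209, "IBM": 188}

# Share of the list held by each vendor, in per cent.
shares = {vendor: 100.0 * count / TOTAL for vendor, count in systems.items()}

# Whatever is left belongs to every other vendor combined.
others = TOTAL - sum(systems.values())

for vendor, share in shares.items():
    print(f"{vendor}: {share:.1f} per cent")
print(f"All other vendors: {others} systems")
```

That leaves 103 systems for everyone else. Note that this counts boxes only; it says nothing about how much of the aggregate flops each vendor holds.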

Last time around, IBM had more machines than HP. But thanks to all those big Power or Opteron-Power boxes, as well as a fair number of x64 machines, IBM has the largest share of the installed processing capacity on the Top 500 list. When you consider that flops equals dollars, this is what would seem to matter most. Then again, the Top 500 super list is a lot of free and broad public relations and marketing, isn't it?

The aggregate performance of the Top 500 list has grown quite a bit since June, up 44.9 per cent to 16.95 petaflops from the June 2008 list and about 2.4 times the 6.97 petaflops of installed capacity in the November 2007 list. It probably will not be all that long before we have a multi-petaflops system installed somewhere that has more oomph than the entire 500-strong list from a year ago. To even get on the list this time around, a machine had to have 12.64 teraflops of power.
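Those growth figures can be cross-checked with a little arithmetic. The Python sketch below uses only the numbers quoted above; the June 2008 total it prints is derived from the 44.9 per cent growth figure, not read off the list itself.

```python
# Aggregate sustained performance quoted in the article, in petaflops.
nov_2008 = 16.95
nov_2007 = 6.97
growth_since_june = 0.449  # 44.9 per cent

# Work backwards from the growth rate to the June 2008 total.
implied_june_2008 = nov_2008 / (1 + growth_since_june)

# Year-over-year multiple.
year_over_year = nov_2008 / nov_2007

print(f"Implied June 2008 total: {implied_june_2008:.1f} petaflops")
print(f"November 2008 vs November 2007: {year_over_year:.2f}x")
```

The implied June total comes out at about 11.7 petaflops, and the year-over-year multiple at roughly 2.4.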

The Top 500 list is not restricted to the United States, of course. But with 291 machines out of the 500, the U.S. dominates the list. In the June list, 257 machines were based in America. This time around, European countries accounted for only 151 systems, down from 184 boxes. The United Kingdom had 45 machines (down from 53 supers on the June list), and Germany dropped off fast with only 24 machines, down from 46 boxes in the June list. The Asia/Pacific region's count on the list stayed the same this time around, at 47 machines. Japan had 18 boxes on the November 2008 supers list (down from 22), China had 16 machines (up from 12), and India had 8 systems (up from 6). ®
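The geographic shifts are easier to see as deltas. A small Python sketch that tabulates the June and November counts quoted above (the layout and names are illustrative only):

```python
# (June 2008, November 2008) system counts quoted in the article.
counts = {
    "United States": (257, 291),
    "Europe": (184, 151),
    "United Kingdom": (53, 45),
    "Germany": (46, 24),
    "Asia/Pacific": (47, 47),
    "Japan": (22, 18),
    "China": (12, 16),
    "India": (6, 8),
}

for region, (june, nov) in counts.items():
    print(f"{region:15s} {june:4d} -> {nov:4d} ({nov - june:+d})")

# The UK and Germany are part of Europe, and Japan, China, and India are
# part of Asia/Pacific, so only the three top-level regions are summed.
top_level = ("United States", "Europe", "Asia/Pacific")
accounted = sum(counts[region][1] for region in top_level)
print(f"Systems in the three regions: {accounted} of 500")
```

The three regions account for 489 of the 500 machines, so the remaining 11 sit elsewhere in the world.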

