Power supplies used by HP ProLiant DL380 G5 rack-mounted servers may fail if they're left dormant for long periods of time in certain environments, according to the experience of one customer and comments from HP. This could be a problem for servers that are only powered up for failovers and failover testing.
A reader of The Register experienced this peculiar problem with the power supplies in its Hewlett-Packard ProLiant servers. The ProLiant user, who wants to remain anonymous, co-locates its machinery at a Level 3 data center in London. All of the data for the servers is stored on a storage area network in the data center and is replicated to a backup site in Paris. The SAN replication is hot, but because it doesn't make a lot of sense to keep servers running in Paris doing nothing, this company keeps the machines powered off until a disaster strikes.
Once a year, the company does a failover to the Paris site to test out its disaster recovery procedures, which is exactly what you are supposed to do. (Some companies do rollovers once a quarter these days.) That means firing up the servers and linking them up to the backup SAN in the Paris data center and then testing how the applications and their transactions fared.
Three years ago, this company installed 14 ProLiant DL380 G5 rack-mounted servers in the backup site in Paris. For two years, as part of the disaster recovery test, the company fired up the machines and everything worked hunky dory, with the machines powered up and running workloads for about 10 hours before failing back over to the primary data center in London. But in March of this year, when the company tried to power up the machines – which have redundant power supplies in each of the machines, by the way – 10 out of the 14 servers would not turn on.
This is when the operations manager for the data center started sweating. After dying on hold with HP support and a lengthy conversation, our intrepid El Reg reader was told by the HP support team that the life of a power supply "is severely diminished if you don't keep it powered up." That raised some eyebrows back in London, as you might imagine. "Shocking, to say the least," the company's operations manager told El Reg.
The company had to get someone over to the Paris data center to coordinate the replacement of the power supplies, which HP did promptly. The disaster recovery procedure will be tested a few weeks from now.
We worked up through the HP channels to try to get some kind of explanation of what HP believed had happened here and whether other customers had been similarly affected by server power supplies failing after being left dormant for long periods of time.
"HP believes this was an isolated case with this particular power supply," Jim Ganthier, vice president of marketing for HP's Industry Standard Servers division, told El Reg by email. "It's hard to tell what actually caused the issue, but HP believes it was environmental – that the power supplies or the servers were stored in an area of high humidity, water, etc. We have not heard of any other issues with power supplies on any other HP ProLiant DL380 G5s – or similar issues with these servers from other customers. Once we learned about the issue, HP worked with the customer to resolve the issue."
If you have had an issue like this on any kind of server, let us know. ®