The Cisco/EMC/VMware/Intel lovechild VCE has a simple schtick: the boxed-up rigs of hardware and software it sells come in configurations that have been documented and tested to the last detail. As the company told us by email, “we commit to delivering Systems that have been engineered, tested and certified as one.”
VCE is also big on support, and just about promises to chopper in ninja sysadmins within moments of even a single management console light going from green to red.
But the plan isn't always working, according to one VCE customer. The customer bought a Vblock running VCE's Release Certification Matrix version 4.07. The Release Certification Matrix (RCM) is VCE's twice-yearly frozen bundle of hardware, software and documentation.
Included in the bundle was version 2.2.2-17 (SP2) of the software that runs the EMC XtremIO array that shipped with the Vblock. That software contains a flaw that duly struck this user's Vblock, with the result that “our entire array went down hard. It took VCE/EMC 6+ hours to get it back online (in a degraded state,) and around 8 hours to restore full functionality.”
The user, who chose to remain anonymous but has posted an account of their experiences on Reddit and shared more details in an exchange with The Register, accepts that this can happen.
“I think any Vblock customer will tell you that they understand that you lose a certain degree of flexibility when it comes to responding to code releases under the RCM model,” they wrote. “That is a trade-off many are willing to make for the increased confidence of all the components 'playing nice' with one another.”
VCE is up-front about this. In correspondence with The Reg a company spokesperson told us that “When new components / code are brought to market, VCE aims to supporting code refreshes within 45 days.” Between that 45-day gap and the six-month interval between RCM refreshes, “it is possible that a Vblock will ship with code that has been superseded.”
VCE will, however, scramble to implement and support security patches, which it promises to release “within 1-2 weeks of the known vulnerability.”
The user's complaint isn't about that gap. Rather, the user feels EMC may have dropped the ball because it “admitted that there had been no public advisory/ETA on the issue.”
Without any information, even partner VCE struggled to address the issues.
“My gripe with VCE in this case is more about their communication with EMC. EMC knew about the bug, but did VCE?” the user said.
“That is a question that I have not gotten an answer to as of yet. It's one thing to have an issue and discover that you've encountered an unknown bug, but it's quite another to be down and then find out that you've run into a wall that was known to be there,” the user writes, concluding that “I'm hoping that this can be a learning experience for them, and that they can collaborate more closely during the certification process in the future.”
There are probably lessons for other stack-in-a-box players here, too, and even for those who offer or adopt reference architectures, as most offer long-ish refresh times for their published configurations. ®