Fujitsu has got itself a 50PB-plus scale-out array for very big data, the CD10000.
Properly named the ETERNUS CD10000 Hyperscale Storage System, the box uses Ceph, open source software that presents file, block and object storage from a distributed cluster of object storage nodes across which data is striped. Theoretically, Ceph can scale up to 1 exabyte, so there is lots of capacity headroom.
Hardware-wise, the CD10K supports up to 224 nodes, which live in standard 19-inch racks and communicate via dual 40Gbit/s InfiniBand links. There are three types of node:
- Basic node – with 2 Xeon CPUs and 12.6TB raw capacity using 16 x 900GB 2.5-inch 10,000rpm SAS disk drives and a PCIe SSD for caching, journalling and metadata.
- Capacity node – with 252.6TB of raw capacity from 60 x 3.5-inch 7,200rpm SATA disk drives plus 14 x 900GB SAS drives.
- Performance node – with 34.2TB of raw capacity using 10K rpm SAS 2.5-inch spindles and PCIe SSDs.
Front-end access is via a 10GbitE LAN, and the system's resources can be accessed via KVM, Swift and S3.
The system’s usable capacity depends upon the number of data replicas, two or three for example, set up to protect against data loss.
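The replica arithmetic can be sketched in a few lines of Python. This is a rough illustration, not Fujitsu's sizing method: it assumes all 224 nodes are capacity nodes at 252.6TB raw each, that the replica count is the total number of copies kept, and that usable space is simply raw space divided by that count. Journal, metadata and free-space overheads are ignored, and the function names are made up for this example.

```python
# Rough usable-capacity estimate for a replicated cluster.
# Assumptions (not from Fujitsu): every node is a capacity node,
# "replicas" means total copies stored, and no other overhead applies.

NODES = 224                    # maximum node count quoted for the CD10K
CAPACITY_NODE_RAW_TB = 252.6   # raw TB per capacity node


def raw_capacity_tb(nodes, raw_tb_per_node):
    """Total raw capacity in TB across all nodes."""
    return nodes * raw_tb_per_node


def usable_capacity_tb(raw_tb, replicas):
    """Usable TB when every object is stored `replicas` times."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return raw_tb / replicas


raw = raw_capacity_tb(NODES, CAPACITY_NODE_RAW_TB)
for replicas in (2, 3):
    usable = usable_capacity_tb(raw, replicas)
    # Decimal units: 1 PB = 1000 TB
    print(f"{replicas} replicas: {usable / 1000:.1f} PB usable "
          f"of {raw / 1000:.1f} PB raw")
```

Under these assumptions the fully populated system holds about 56.6PB raw, which is where the 50PB-plus figure comes from, and roughly 28.3PB usable with two replicas or 18.9PB with three.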
Ceph doesn’t use RAID, which is a good thing with around 13,400 disk drives in a 224-node system configured for capacity, as RAID rebuilds from disk failures would be constantly taking place. Replication is used instead, and the system self-heals and has zero downtime, according to Fujitsu. The protection level can be raised by adding replicas, going from one to two to three or four.
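To see why rebuilds would be near-continuous at this scale, here is a back-of-the-envelope sketch. The drive count follows from 224 capacity nodes with 60 SATA drives each; the 2 per cent annual failure rate is purely an illustrative assumption, not a Fujitsu or drive-vendor figure, and the function name is hypothetical.

```python
# Back-of-the-envelope drive failure estimate for a large cluster.
# ASSUMED_AFR is an illustrative assumption, not a measured figure.

NODES = 224
SATA_DRIVES_PER_NODE = 60   # capacity-node SATA drives only
ASSUMED_AFR = 0.02          # assumed annual failure rate per drive


def expected_failures_per_year(drive_count, annual_failure_rate):
    """Expected number of drive failures per year across the cluster."""
    return drive_count * annual_failure_rate


drives = NODES * SATA_DRIVES_PER_NODE
failures = expected_failures_per_year(drives, ASSUMED_AFR)
days_between = 365 / failures

print(f"{drives} drives, about {failures:.0f} failures a year, "
      f"one every {days_between:.1f} days")
```

With that assumed rate, 13,440 drives would see roughly 270 failures a year, or one about every 1.4 days. A different failure rate changes the numbers but not the conclusion that some drive is almost always being recovered.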
Fujitsu says the array can last a long, long time as components are refreshed with newer technology; new nodes can be swapped in with old nodes remaining or being retired. The software takes care of moving data around to use the new nodes.
The firm says customers get the open source advantages of low cost, fast development and freedom from vendor lock-in, while having a single throat to choke for end-to-end maintenance and support.
The company says it is working with partners and customers to add system applications to the CD10K such as cloud services, file sync ’n share, archiving and iRODS data discovery.
Fujitsu ETERNUS CD10000
This is a heavyweight big data system for customers with vast data repositories, such as CSPs, universities, financial and public institutions with cloud and cloud-scale projects, telcos and enterprises adopting OpenStack – Ceph being a widely used storage back end for OpenStack.
Fujitsu says the CD10K can be a single aggregating silo for block, file and object data and that it is making Ceph usable by enterprises. If you are looking at systems using Ceph, GPFS, Lustre, Gluster or DDN's WOS, then the CD10K should be on your inspection list.
It's possibly one of the most scalable scale-out systems on the planet, certainly out-scaling EMC's Isilon product.