JackRabbit 5U 96TB time trials

By joe

July 12, 2009

The RAID has finished building on the JackRabbit 5U (JR5) (these units are now available from us or our reseller partners in the US, EU, and India). As a refresher, this is the 96TB unit, with 3 RAID cards and 48x 2TB enterprise SATA disks. The RAIDs are hardware RAID6 (16 drives each: 1 hot spare and 15 RAID drives, yielding 13 data drives). 3 groups of 13x 2TB drives is 78TB. Actual usable capacity in this configuration is a little less due to TB<->TiB conversion and file system overhead. We see

[root@jr5 ~]# df -h /data
Filesystem            Size  Used Avail Use% Mounted on
/dev/md0               71T  161G   71T   1% /data
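As a quick cross-check of the capacity arithmetic, here is a minimal Python sketch. The variable names are ours, purely for illustration, and the figures are the ones quoted above. It assumes `df -h` is reporting binary units (powers of 1024), which is the GNU `df` default.

```python
# Figures from the post; names are illustrative only.
DRIVE_TB = 2                  # decimal terabytes per drive
DRIVES_TOTAL = 48
RAID_GROUPS = 3
DATA_DRIVES_PER_GROUP = 13    # 16 drives - 1 hot spare - 2 RAID6 parity

data_drives = RAID_GROUPS * DATA_DRIVES_PER_GROUP
raw_tb = DRIVES_TOTAL * DRIVE_TB          # nominal capacity in TB
data_tb = data_drives * DRIVE_TB          # capacity left for data in TB
data_tib = data_tb * 10**12 / 2**40       # same capacity in binary TiB

print(f"raw: {raw_tb} TB, data: {data_tb} TB = {data_tib:.1f} TiB")
print(f"drives given up to spares and parity: {DRIVES_TOTAL - data_drives}")
```

Under those assumptions, 78TB of data drives works out to just under 71 TiB, which lines up with the 71T that `df -h` reports.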

Still, 71TB is a 26% reduction from the nominal 96TB. We have choices to make, to trade capacity for resiliency and performance. We effectively lose 9 drives (18TB) to RAID and redundancy overhead. But we get performance and resiliency for this. It is a reasonable tradeoff. The following is the baseline, pre-tuning performance we are seeing for this.

streaming write:

[root@jr5 ~]# !126
dd if=/dev/zero of=/data/big.file ...
10240+0 records in
10240+0 records out
171798691840 bytes (172 GB) copied, 94.8258 seconds, 1.8 GB/s

Not bad. I had predicted something higher, but these drives specifically take action to reduce power consumption, and have some dynamic self-tuning capability that will balance power versus performance, vibration, and so on. Now for the read.

[root@jr5 ~]# dd if=/data/big.file of=/dev/null bs=16M iflag=direct
10240+0 records in
10240+0 records out
171798691840 bytes (172 GB) copied, 76.6224 seconds,  2.2 GB/s
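The rates `dd` prints are rounded to two significant figures, so it can help to re-derive them from the byte counts and elapsed times. A small Python sketch follows; the helper name `rate_gb_per_s` is ours, and it assumes `dd` is using decimal GB (10^9 bytes).

```python
# Byte counts and elapsed times copied from the dd output above.
runs = {
    "write": (171798691840, 94.8258),
    "read": (171798691840, 76.6224),
}

def rate_gb_per_s(nbytes, seconds):
    """Throughput in decimal GB/s (assumed to be the unit dd prints)."""
    return nbytes / seconds / 1e9

# 10240 records of 16 MiB each should account for every byte copied.
assert 10240 * 16 * 2**20 == 171798691840

for name, (nbytes, seconds) in runs.items():
    print(f"{name}: {rate_gb_per_s(nbytes, seconds):.2f} GB/s")
```

This gives roughly 1.81 GB/s for the write and 2.24 GB/s for the read, consistent with the rounded figures above.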

I should note that these drives' adaptive power and self-tuning work both ways. 2.2 GB/s is about the median of what we have seen in testing. The lowest performance comes when the drives are self-adjusting for vibration, and is in the 1.3-1.5 GB/s region. The highest performance … well … I wish they gave us a way to turn off some of these things. For a while, each RAID is cranking out 900+ MB/s. That is 2.7 GB/s, sustained for about 30 seconds or so, and then the auto power and vibration tuning kicks in. More testing is in progress.

[update] I should note that we are selling clusters of these and other units, complete with GlusterFS (or other file systems if customers require, such as PVFS2, Lustre, GFS, …). With this density, we are looking at 568 TB USABLE per 41U rack or 639 TB USABLE per 48U rack. 2 racks is 1 PB. Usable.

For each rack full, let's use a lower number of 1.5 GB/s per unit. 8 units gets us 12 GB/s, and we can present this out over multiple DDR/QDR IB ports (this is one of the nicer aspects of the design of Gluster and Lustre with the O2IB bits). Even if we could only achieve 50% utilization of this 12 GB/s, this is 6 GB/s per rack (roughly 750 MB/s per box). The 1PB unit would be about 12 GB/s. These numbers are, of course, assuming significant performance lossage.

We could follow competitors' approaches, and simply quote theoretical (i.e. unattainable/unapproachable) maximums, and draw you into believing you would achieve something you really couldn’t. Then each box in this configuration would be ~2.7 GB/s, 8 boxes would provide 21.6 GB/s, and with QDR to each box, we could provide this out per rack. Two racks would be 43.2 GB/s. Of course this analysis is silly; anyone quoting theoretical maximum numbers is doing their customers a disservice. Quote what you see, what you measure. It's the only thing that actually matters in setting realistic expectations.
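For anyone who wants to redo the rack-level arithmetic with their own numbers, here is a rough Python sketch. The function names are ours, and the units-per-rack counts (8 and 9) are inferred from 568/71 and 639/71 rather than stated anywhere as a spec, so treat them as assumptions.

```python
USABLE_TB_PER_UNIT = 71                  # usable figure reported by df above
UNITS_PER_RACK = {"41U": 8, "48U": 9}    # assumption: inferred from 568/71 and 639/71

def rack_capacity_tb(units):
    """Usable capacity of a rack holding `units` boxes."""
    return units * USABLE_TB_PER_UNIT

def rack_throughput_gbs(units, per_unit_gbs, utilization=1.0):
    """Aggregate GB/s for a rack, derated by an assumed utilization factor."""
    return units * per_unit_gbs * utilization

for rack, units in UNITS_PER_RACK.items():
    print(f"{rack}: {rack_capacity_tb(units)} TB usable")

units = UNITS_PER_RACK["41U"]
conservative = rack_throughput_gbs(units, 1.5, utilization=0.5)
peak = rack_throughput_gbs(units, 2.7)
print(f"conservative: {conservative:.1f} GB/s per rack, {2 * conservative:.1f} GB/s for two")
print(f"peak-only:    {peak:.1f} GB/s per rack, {2 * peak:.1f} GB/s for two")
```

The gap between the two throughput lines is the whole point: the derated figure is what we would plan around, and the peak figure is the one we would not quote.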