finally, I can point to a comparison someone else ran

Have a look at this:

root@cisd-ruapehu # time dd if=/dev/zero of=80G bs=8192k count=10000
10000+0 records in
10000+0 records out
real    2m41.484s
user    0m0.064s
sys     2m26.890s

Quick calculation. This is a 44 disk raidz2 striped in the “optimal” manner according to the guide quoted. This is roughly 80GB in 161 seconds, or 0.497 GB/s. Under 500 MB/s. Yup.
They show off their blazing 560 MB/s performance on smaller files (a 20GB file on a system with 16GB of RAM, so mostly cached).
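For anyone who wants to redo the arithmetic, here is a minimal sketch that recomputes the rate straight from the dd transcript. It assumes dd's `k` suffix means 1024 bytes, so `bs=8192k` is 8 MiB per block; the function and variable names are mine, purely for illustration. The answer shifts a little depending on whether you count in decimal or binary units, but it lands at roughly half a gigabyte per second either way.

```python
# Figures taken from the dd transcript above.
BLOCK_BYTES = 8192 * 1024        # bs=8192k, assuming k = 1024 bytes (8 MiB per block)
BLOCK_COUNT = 10000              # count=10000
ELAPSED_S = 2 * 60 + 41.484      # real 2m41.484s


def throughput(total_bytes, seconds):
    """Return the transfer rate as (MB/s, MiB/s)."""
    bytes_per_second = total_bytes / seconds
    return bytes_per_second / 1e6, bytes_per_second / 2**20


total_bytes = BLOCK_BYTES * BLOCK_COUNT
mb_s, mib_s = throughput(total_bytes, ELAPSED_S)

print(f"{total_bytes / 1e9:.1f} GB written in {ELAPSED_S:.1f} s")
print(f"{mb_s:.0f} MB/s (decimal units), {mib_s:.0f} MiB/s (binary units)")
```

Note that `sys` time (2m26.890s) is nearly all of the wall clock time, so the box spent almost the whole run in the kernel.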
Now I have independent confirmation of how much faster our 16 bay 3U JackRabbit is than this, as well as our 24 bay 4U unit. Don’t even need to get to the 48 bay 5U unit.
Took a snapshot (pdf) just in case this goes into a memory hole.
Our 3U unit sports 750+ MB/s on the same test at well under 1/2 the price. Our 24 bay 4U unit sustains 1.6 GB/s on the same test with about the same capacity, at under 1/2 the list price of this unit.
We could put 10k or 15k RPM SAS drives in JackRabbit, but that would be just plain old cruel of us.
Part of the reason I bring any of this up is that we often get the “2GB/s” number thrown at us in discussions. Customers know about this number. We ask them (practically beg them) to test it. We often hear … well … rather different numbers back. Sadly, we can’t use any of them, as they aren’t in the public domain. These numbers are, though, and hopefully they won’t go down the memory hole.

3 thoughts on “finally, I can point to a comparison someone else ran”

  1. Hi, they only get ~500 MB/s when writing to the ZFS pool because they appear to run an old Solaris 10u4 (see: “Live upgrade – Fully explained here: […]”). ZFS checksum and parity calculations only became multithreaded in Solaris Nevada b79:
    500 MB/s is consistent with fletcher2 (default ZFS checksum algorithm) bottlenecking on one of the ~2GHz Opteron cores of the Thumper. So really you want to run b79+ (or Solaris 10u6 – due out very soon) to fully exploit the 4 cores of a Thumper.
    Also, with ZFS raidz2 you have dual-parity + checksums by default, whereas JackRabbit’s RAID6 only does dual-parity (no checksums, i.e. it is unable to self-heal when the dual-parity detects a corruption). Because this feature doesn’t exist in JackRabbit, you should disable checksums in ZFS for a fair benchmark between the two.
    -jim (happy user of ZFS at home :P)

  2. @Jim
    Thanks for the note. I have in the past (though I had to delete the public presentation of the results) benchmarked Solaris vs Linux on one of our boxen. We are about to do this again with the latest OpenSolaris drop, and we will have a new box to test as well.
    We turned off checksumming, ZIL and many other things. ZFS was not close to ext3 in performance, and definitely not close to xfs or jfs performance.
    My main concern really was testing the underlying hardware. We keep running into marketing numbers which aren’t well aligned with people’s measurements/experience. We haven’t been able to find xfs (or ext3) data on x4500 or x4560 online. Ext3 is just not an appropriate large file system, so I generally care less about that, but it is a useful data point.
    OTOH, we have heard from many sources that what they measure on the unit isn’t even remotely close to the marketing numbers. Since we report what we measure, we can directly compare our unit’s performance to what we find. Turning off the things that people suggest doesn’t suddenly pull an extra 500-1000 MB/s out of a hat.
    There is a significant performance benefit to the tightly coupled approach we take with JackRabbit. This is what we were looking to compare.

  3. Regarding the “2 GB/s” sometimes quoted in Thumper marketing material, I know that they reached that throughput with striping on top of 46 drives (2 drives reserved for mirroring for the root filesystem). Actually I remember even 3 GB/s being quoted. If one day you get your hands on a Thumper, try it 🙂 zpool create mypool disk1 disk2 … diskN.
    (Of course pretty much nobody would run a 46-drive striped setup in production, because the probability of losing a drive, and with it the entire pool, is very high. Such a setup only makes sense in the rare cases where the data is easy to regenerate.)
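To put a rough number on that risk, here is a minimal sketch. It assumes drives fail independently and uses a hypothetical 3% annual failure probability per drive; that figure is illustrative only and is not a measurement from this thread. A plain stripe has no redundancy, so the pool is lost as soon as any one drive fails.

```python
def stripe_loss_probability(n_drives, p_drive):
    """Probability that at least one of n independent drives fails.

    A plain stripe has no redundancy, so one failed drive loses the pool.
    """
    return 1.0 - (1.0 - p_drive) ** n_drives


P_DRIVE = 0.03  # hypothetical annual failure probability per drive (assumption)

for n_drives in (1, 8, 46):
    p_loss = stripe_loss_probability(n_drives, P_DRIVE)
    print(f"{n_drives:2d} drives: {p_loss:.1%} chance of losing the pool in a year")
```

Under those assumptions a 46-drive stripe is more likely than not to lose the pool within a year, while a single drive sits at the 3% you put in.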
