We have been working on testing and benchmarking JackRabbit iSCSI over 10 GbE. Without spilling too many beans, let me describe how our benchmark tests differ from most everyone else's, and then I will talk about the performance we get.
Most iSCSI benchmarks we have seen target the nullio device or a RAM disk. That is, they are benchmarks of the protocol, and have little if anything to do with the performance you will actually observe. Sadly, this appears to be the norm, as the general consensus is that spinning disks are not fast enough for the protocol.
This is untrue. Well, untrue for JackRabbit; maybe, even likely, true for other solutions. As a reminder, we are seeing a sustained 750+ MB/s for 1.3 TB reads and writes to our disks. Now if someone wants to claim those are cached somehow, well, I would like to see 1.3 TB of RAM in a sub-$10,000 box. This unit offers, BTW, the best performance per dollar that we are aware of on the market at these price points. And not by a few percent; the gap appears to be quite significant.
Back to the benchmark discussion.
So everyone else is avoiding committing bits to disk. Well, almost everyone else. I did find one benchmark from users of an HP or Dell system getting ~30 MB/s sustained to their disks over iSCSI.
They ran bonnie++. As I have pointed out in the past, iozone is problematic, as these days it tends to test the speed of the system page cache, not I/O speed. More people are starting to recognize these limitations.
One of the better (and very simple) tests for raw observable speed to disk is the dd command we used last time. It doesn't map well onto most workloads, but it pretty much fills the I/O channel for you, and tells you what you should expect as the upper limit to a real file system.
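As a concrete sketch of that dd test: the transfer must be far larger than RAM (our runs used 1.3 TB) or use direct I/O, otherwise you are timing the page cache. The path and sizes below are placeholder examples, scaled way down so the snippet runs anywhere:

```shell
# Streaming write then read with dd. conv=fdatasync forces the data to
# media before dd reports a rate; on filesystems that support O_DIRECT you
# can use oflag=direct / iflag=direct instead to bypass the cache outright.
# TARGET is a placeholder; point it at the iSCSI-backed filesystem and
# raise count well past RAM size for a real measurement.
TARGET=${TARGET:-/tmp/dd_testfile}
dd if=/dev/zero of="$TARGET" bs=1M count=64 conv=fdatasync 2>&1 | tail -1
dd if="$TARGET" of=/dev/null bs=1M 2>&1 | tail -1
rm -f "$TARGET"
```

The last line of each dd is the throughput summary; that number, not a protocol-only figure, is what your applications will actually see.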
Our tests all used real disk backing store. No RAM disk. No nullio. If it isn't going to disk, it is not a valid test of the speed of the storage system: by definition, you aren't testing a storage system unless you actually store something on it.
For our 10 GbE tests, our write speed was ~91% of the single-thread maximum measured network performance, and our read speed was 85% of it. We measured the network performance using iperf. Without revealing who the 10 GbE card vendor is, let's just say it was not 10 Gb/s of bandwidth we were observing. Or 8 Gb/s. I would like to test other cards, and we have inquiries out to other vendors now as well. Yes, we should test iSER as well, if I have time, as this JackRabbit has been sold, and we were using these tests as part of our burn-in.
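The methodology here is a two-step check: measure the raw single-thread network ceiling with iperf, then express the to-disk iSCSI rate as a fraction of it. A minimal sketch; the hostnames are placeholders and the two rates are hypothetical example values, not our measurements:

```shell
# Step 1: raw single-stream TCP ceiling (needs iperf on both ends, so
# shown as comments):
#   target$    iperf -s
#   initiator$ iperf -c <target-ip> -t 60
# Step 2: express the measured-to-disk iSCSI rate against that ceiling.
# NET_MBS and ISCSI_MBS are illustrative placeholders, not measurements.
NET_MBS=900     # single-thread iperf result, MB/s (assumed)
ISCSI_MBS=820   # sustained dd write over iSCSI, MB/s (assumed)
awk -v n="$NET_MBS" -v d="$ISCSI_MBS" \
    'BEGIN { printf "iSCSI at %.0f%% of measured wire speed\n", 100*d/n }'
```

Quoting efficiency against the measured single-thread ceiling, rather than against the nominal 10 Gb/s, is the honest comparison when the NIC itself cannot reach line rate.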
While doing the writes over the iSCSI link, we checked the status of the system. According to the different measures we watched, the disks were running at about 60%, plus or minus a little, of their capability. That is, we think there is quite a bit more headroom, which could be unlocked with a faster card.
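One way to make that headroom check: watch per-disk utilization while the write runs, then back out a rough disk-side ceiling. A sketch (iostat comes from the sysstat package; the utilization and rate below follow the round figures in this post, used here only as illustrative inputs):

```shell
# Watch extended per-device stats every 5 s while the iSCSI write runs;
# %util near 100 means the disks themselves are the bottleneck.
# (Requires sysstat, so shown as a comment:)
#   iostat -x 5
# Back-of-envelope ceiling when the disks are only partly busy.
UTIL=60     # observed disk utilization, percent (approximate)
RATE=750    # sustained throughput at that utilization, MB/s
awk -v u="$UTIL" -v r="$RATE" \
    'BEGIN { printf "estimated disk-side ceiling: ~%.0f MB/s\n", r*100/u }'
```

Linear extrapolation from utilization is optimistic (seek and queueing behavior change as disks saturate), but it is a reasonable first-order indicator that the network card, not the disk array, is the current wall.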
The user we were speaking with suggested a particular workload as being useful to them for testing, so we wrote a benchmark application that generated that workload. The conditions were a little artificial due to the way we had to run it, but overall throughput was about 85% of the single-thread maximum measured network performance.
Remember, all of this is to disk, not to nullio or a RAM disk. Our numbers are somewhat faster than what has been discussed recently on the scst/open-iscsi/scsi-tgt lists (where we were looking for benchmark data to compare against). And we have more headroom. In a sub-$10,000 box.
I am hoping that we can convince other 10 GbE vendors to loan us some cards for testing. This is a seriously fast storage system, and it looks like it does a very good job of exploiting the card/link performance.