on the test track

By joe

February 21, 2009

One of the issues often raised in discussions with users is the IOP performance of JackRabbit. We have measured our 24 bay unit at a bit more than 5000 IOPs on 8k random reads. That test closely matches a case handed to us by a customer looking at a competitive box, which scored under 4300 IOPs on the same test. The problem is that getting consistent, workable tools to do this measurement is hard … Windows users use IOmeter, other users will use SPC-1 and related. But we want a nice consistent tool for the job. I had thought of rolling one, until I found Jens Axboe’s fio tool. With this, I can recreate the relevant workloads fairly easily.

Of course, the purpose of this post is not to talk about fio, but instead to talk (a little) about a new project we took out to the test track. This is an IO system designed for speed. We wanted to do some time trials in the usual manner, then crack the throttle wide open and stand out of the way of the bow shock. Which is what we did.

These are the first tests of “velocibunny”. First, our fio run case (a file named random.fio):

[random]
rw=randread
size=8g
directory=/velocibunny/data
iodepth=192
direct=1
blocksize=8k
numjobs=16
nrfiles=2
group_reporting

The machine has 16 GB of RAM, so 16 jobs × 8 GB each gives 128 GB of test data. Put another way, this is 8x the system RAM, far more than the cache can hold. Let’s just accept that what you will see below are uncached results. That, and I turned on direct IO to ensure no caching …

Second, some simple read and write tests. More in a little bit.

streaming read

root@pegasus:~# dd if=/velocibunny/big.file of=/dev/null ...
32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 37.947 s, 905 MB/s

streaming write

root@pegasus:~# dd if=/dev/zero of=/velocibunny/big.file bs=1M  ...
32768+0 records in
32768+0 records out
34359738368 bytes (34 GB) copied, 65.754 s, 523 MB/s
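
The dd rates above are simply bytes copied divided by elapsed time, reported in decimal megabytes. A minimal Python sketch of that arithmetic, using only the byte count and timings printed by dd (the function and variable names are made up for illustration):

```python
# Cross-check of the dd figures quoted above: rate = bytes copied / elapsed seconds.
# dd reports decimal megabytes (1 MB = 10**6 bytes).

def dd_rate_mb_per_s(bytes_copied: int, seconds: float) -> float:
    """Return throughput in decimal MB/s, the unit dd prints."""
    return bytes_copied / seconds / 1e6

# 32768 records of 1 MiB each, as shown in the dd output.
BYTES_COPIED = 34359738368

read_rate = dd_rate_mb_per_s(BYTES_COPIED, 37.947)   # streaming read timing
write_rate = dd_rate_mb_per_s(BYTES_COPIED, 65.754)  # streaming write timing

print(f"streaming read:  {read_rate:.0f} MB/s")
print(f"streaming write: {write_rate:.0f} MB/s")
```

Rounded to whole numbers, this reproduces the 905 MB/s and 523 MB/s that dd printed.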

Third, IOPs measurements, using the above random.fio

root@pegasus:~# fio random.fio
random: (g=0): rw=randread, bs=8K-8K/8K-8K, ioengine=sync, iodepth=192
...
random: (g=0): rw=randread, bs=8K-8K/8K-8K, ioengine=sync, iodepth=192
Starting 16 processes
...
Jobs: 1 (f=2): [____r___________] [99.9% done] [ 25124/     0 kb/s] [eta 00m:01s]
random: (groupid=0, jobs=16): err= 0: pid=22631
  read : io=131072MiB, bw=192430KiB/s, iops=23490, runt=714225msec
    clat (usec): min=53, max=34308, avg=668.35, stdev=50.14
    bw (KiB/s) : min= 8355, max=26001, per=6.30%, avg=12125.57, stdev=162.19
  cpu          : usr=1.28%, sys=4.71%, ctx=32759646, majf=0, minf=13491
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w: total=16777216/0, short=0/0
     lat (usec): 100=0.01%, 250=0.01%, 500=15.16%, 750=59.60%, 1000=19.33%
     lat (msec): 2=5.80%, 4=0.10%, 10=0.01%, 20=0.01%, 50=0.01%
Run status group 0 (all jobs):
   READ: io=131072MiB, aggrb=192430KiB/s, minb=192430KiB/s, maxb=192430KiB/s, mint=714225msec, maxt=714225msec
Disk stats (read/write):
  sda: ios=16777216/304495, merge=0/219870, ticks=9584400/3768460, in_queue=13337950, util=84.47%
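
The fio summary above can be sanity-checked against the job file. The Python sketch below is only a cross-check, not part of the test itself; the variable names are made up, and every number comes from random.fio or the output. It also assumes, based on the "IO depths: 1=100.0%" line, that with the sync engine each job keeps one request in flight regardless of the iodepth=192 setting, so throughput should be roughly the number of jobs divided by the average completion latency.

```python
# Cross-check of the fio summary using numbers from random.fio and the output above.
KIB = 1024
GIB = 1024**3

numjobs = 16                 # numjobs=16
size_per_job = 8 * GIB       # size=8g
block_size = 8 * KIB         # blocksize=8k
ram = 16 * GIB               # stated system RAM
runtime_s = 714.225          # runt=714225msec
avg_clat_s = 668.35e-6       # clat avg=668.35 usec

total_bytes = numjobs * size_per_job
total_ios = total_bytes // block_size      # compare with "issued r/w: total="
iops = total_ios / runtime_s               # compare with "iops="
ram_multiple = total_bytes / ram           # how far the data set exceeds RAM

# Assumption: one outstanding request per job (sync engine), so
# throughput is approximately concurrency / average latency.
iops_from_latency = numjobs / avg_clat_s

print(f"data set:           {total_bytes // 2**20} MiB ({ram_multiple:.0f}x RAM)")
print(f"total IOs:          {total_ios}")
print(f"IOPs from runtime:  {iops:.0f}")
print(f"IOPs from latency:  {iops_from_latency:.0f}")
```

The runtime-based figure matches the reported iops=23490, and the latency-based estimate lands within a few percent of it, which is consistent with 16 jobs each issuing one read at a time.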

Yes, that is not a typo. 23490 IOPs. This is close to an order of magnitude faster than our existing JackRabbit unit for seek bound loads (as measured by a unit going out the door shortly). Notice that the IO bandwidth is pretty good as well.

There are quite a few things we haven’t said, and won’t cover here. What we will say is that this is being productized, it is coming soon, and as you can see from these results, we expect this to be … interesting for our customers who are bound by IOP speeds, and latency in general. Anyone interested in the details we aren’t saying here: contact us, we will have you execute an NDA, and go from there.