Doing a bit more performance testing on the big JR4

[This was an older post from a few weeks ago, sitting in my queue. Cleared it out]

Want to burn it in. Played with an experimental kernel, and found the Mellanox drivers wouldn’t build. Too many things have changed from 2.6.27 to 2.6.29.2.

Ok, reloaded with CentOS 5.3. Will stress test the default kernel. For some reason, we were hitting a strange SSD-RAID interaction, so I swapped out the SSD pair for a spinning rust pair. While SSDs have lots of promise … I am getting a sense that they still don’t have everything working quite the way we want across all vendors. The Intel SSDs are very good, but some of the other industrial SSDs, designed for these applications, appear to have some occasional gotchas.

Ok. Time for some analysis.

First, a snap of dstat while fio was lighting off …

<code>
----total-cpu-usage---- -dsk/total----dsk/sda-----dsk/sdb-----dsk/sdc-----dsk/md0-- -net/total- ---paging-->
usr sys idl wai hiq siq| read  writ: read  writ: read  writ: read  writ: read  writ| recv  send|  in   out >
  0   6  94   0   0   0|   0     0 :   0     0 :   0     0 :   0     0 :   0     0 | 324B 1528B|   0     0 >
  0   6  94   0   0   0|   0     0 :   0     0 :   0     0 :   0     0 :   0     0 | 192B  532B|   0     0 >
  0   6  94   0   0   0|   0     0 :   0     0 :   0     0 :   0     0 :   0     0 | 252B  548B|   0     0 >
  0  15  85   0   0   0|   0  4389M:   0  1098M:   0  1096M:   0     0 :   0  2198M| 192B  548B|   0     0 >
  0  15  85   0   0   0|   0  3483M:   0   870M:   0   872M:   0     0 :   0  1738M| 252B  606B|   0     0 >
  0   6  94   0   0   0|   0     0 :   0     0 :   0     0 :   0     0 :   0     0 | 132B  564B|   0     0 >
  0   6  94   0   0   0|   0    32k:   0     0 :   0     0 :   0    16k:   0     0 | 252B  548B|   0     0 >
</code>

Ignore the total, and look at the md0 burst writes. These are likely some interesting cache effects. Need to probe it more. But it seems to have written about 4GB in about 2 seconds.
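A quick back-of-the-envelope check on that burst, summing the two one-second md0 samples from the dstat output above (treating dstat’s “M” figures as MB, which is close enough for this estimate):

```python
# md0 write column from the two burst samples in the dstat output above
md0_writes_mb = [2198, 1738]

# Roughly 3.8-3.9 GB landed on md0 over the ~2 second burst
total_gb = sum(md0_writes_mb) / 1024.0
print(round(total_gb, 2))
```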

Ok. Just did a streaming direct I/O read and write of 128GB files using fio:

[root@jr4 ~]# fio streaming2.fio 
streaming-write: (g=0): rw=write, bs=4M-4M/4M-4M, ioengine=vsync, iodepth=768
streaming-read: (g=1): rw=read, bs=4M-4M/4M-4M, ioengine=sync, iodepth=768
Starting 2 processes
streaming-write: Laying out IO file(s) (1 file(s) / 131072MiB)
streaming-read: Laying out IO file(s) (1 file(s) / 131072MiB)
Jobs: 1 (f=1): [_R] [100.0% done] [1710339/     0 kb/s] [eta 00m:00s]               
streaming-write: (groupid=0, jobs=1): err= 0: pid=14164
  write: io=125GiB, bw=<strong>1,368</strong>MiB/s, iops=350, runt= 93587msec
    clat (msec): min=1, max=1,897, avg= 2.87, stdev=26.73
    bw (KiB/s) : min=    0, max=2293760, per=115.08%, avg=1611854.41, stdev=516231.39
  cpu          : usr=0.12%, sys=11.80%, ctx=32845, majf=0, minf=30
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=99.8%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w: total=0/32768, short=0/0

     lat (msec): 2=20.76%, 4=74.54%, 10=2.28%, 100=0.01%, 250=0.01%
     lat (msec): 500=0.02%, 750=0.01%, 1000=0.01%, 2000=0.03%
streaming-read: (groupid=1, jobs=1): err= 0: pid=14168
  read : io=128GiB, bw=<strong>1,607</strong>MiB/s, iops=401, runt= 81572msec
    clat (msec): min=1, max=362, avg= 2.49, stdev= 2.80
    bw (KiB/s) : min=15544, max=1785856, per=100.14%, avg=1647628.49, stdev=175666.18
  cpu          : usr=0.20%, sys=18.49%, ctx=32834, majf=0, minf=1056
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w: total=32768/0, short=0/0
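For reference, the job file would have looked something like this — a sketch reconstructed from the output above, not the actual streaming2.fio; the filename and target path are assumptions:

```
; sketch of streaming2.fio, reconstructed from the run output
; (filename/path are placeholders)
[streaming-write]
rw=write
bs=4m
ioengine=vsync
iodepth=768
direct=1
size=128g
filename=/data/streaming.bin

[streaming-read]
stonewall            ; start a new group: read runs after the write finishes
rw=read
bs=4m
ioengine=sync
iodepth=768
direct=1
size=128g
filename=/data/streaming.bin
```

The `stonewall` is implied by the two group IDs (g=0 and g=1) in the output, which is why the read shows up as a separate group.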

That 1368MiB/s is 1434.5 MB/s. The 1607 MiB/s is 1685 MB/s.
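The MiB→MB conversion is just a factor of 2^20/10^6 ≈ 1.0486; a quick check of the arithmetic:

```python
# Convert fio's binary-prefixed MiB/s figures to decimal MB/s.
MIB = 2**20  # bytes per MiB

def mib_to_mb(rate_mib_s):
    """MiB/s -> MB/s (decimal megabytes)."""
    return rate_mib_s * MIB / 1e6

print(round(mib_to_mb(1368), 1))  # write: 1434.5 MB/s
print(round(mib_to_mb(1607), 1))  # read:  1685.1 MB/s
```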

This is with the updated RHEL/CentOS 5.3 kernel. I should point out that this is on the older XFS release. Lots of performance and bug fixes in the newer code.

This kernel feels more stable. Of course it hasn’t seen octobonnie yet. Coming in a bit. Will be running continuous octobonnie runs this evening. Just to see what it can do.

