# then afterburners kicked in …

… sumthin fierce …
This could be the fastest 4U box on the market for streaming that doesn’t use RAM for storage.

  Run status group 0 (all jobs):
    READ: io=761904MB, aggrb=7455.4MB/s, minb=7634.3MB/s, maxb=7634.3MB/s, mint=102196msec, maxt=102196msec


That streaming is more than 8x RAM size. No PCIe flash cards in the unit. None. Zero. Zilch.
yeah BABY!!!
Right now, running a random read of that data set: 8k random reads across the entire 700+ GB data set. Obviously not cached. Running about 330k IOPS. This is still somewhat disappointing relative to the “theoretical max” numbers, but actually it’s not terrible, and very much along the lines of what I expected.

  read : io=786432MB, bw=2578.2MB/s, iops=330003 , runt=305037msec
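The output above is fio's summary format; a job file roughly like the following could produce similar runs. The device path, queue depths, and job names here are assumptions for illustration, not the actual configuration used on this unit:

```ini
; hypothetical fio job file approximating the two runs above
[global]
ioengine=libaio
direct=1                 ; bypass the page cache so nothing is served from RAM
filename=/dev/md0        ; assumed device path
group_reporting=1

[streaming-read]
rw=read
bs=1m                    ; large sequential blocks for the streaming test
iodepth=32
size=768g

[random-read-8k]
stonewall                ; start only after the streaming job completes
rw=randread
bs=8k                    ; matches the 8k random-read run reported above
iodepth=64
size=768g
```

Run with `fio jobfile.fio`; `direct=1` is what makes the “obviously not cached” claim hold, since reads go straight to the block device.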


I think we can call this one a win …

### 9 thoughts on “then afterburners kicked in …”

1. That’s pretty awesome. Hardware SSD RAID hinted at in previous post? Out of curiosity, any performance drop after you’ve filled the drives a few times, and/or does the card support TRIM (or do garbage collection)? Have you seen any issues with sector alignment degrading performance?

2. @David
All units support TRIM. No degradation over time. Sector alignment is important to performance for any SSD/flash-based system. If you look at the theoretical numbers for most SSDs, they specifically talk about 4k-aligned writes.
No heroic tuning to get here, yet. Thinking about some less invasive tuning we can do.
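To illustrate the alignment point: an access is 4k-aligned when its byte offset is a multiple of 4096, which for a partition means the starting sector times the sector size must divide evenly by 4096. A minimal sketch (the helper name and the example sector numbers are mine, chosen for illustration):

```python
def is_4k_aligned(start_sector: int, sector_size: int = 512) -> bool:
    """Return True if a partition starting at start_sector begins on a 4 KiB boundary."""
    return (start_sector * sector_size) % 4096 == 0

# Modern partitioners start the first partition at sector 2048 (a 1 MiB offset),
# which is 4 KiB-aligned; the legacy DOS offset of sector 63 is not.
print(is_4k_aligned(2048))  # True
print(is_4k_aligned(63))    # False
```

A misaligned partition forces the SSD to do read-modify-write on its internal pages for every straddling write, which is why the spec-sheet numbers assume 4k alignment.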

3. Would be interesting to see IOPS as the operation size changes (downward)… From the graph analysis perspective, 330k IOPS per storage unit is quite interesting.

4. @Jason
Please give me a range that you want to see. We’ll see if we can run it.

I’ve been trying to think of something other than “mmap the device and run GUPS”… Our uses at the moment are for 8×5 and 128×6 chunks of 64-bit data. No, neither power of two nor page-happy. I’m trying to contort things to at least be cache-line-friendly without losing much storage.
And I know this isn’t a reasonable application target for storage… But if it’s reasonable, there’s not much interest in asking. 😉
(BTW, some of the folks curious are the same ones who subsidized a certain TLA company’s hardware development and are getting stabbed…)

6. Is this TRIM the bad (non-queueable) one from the pre-3.1 SATA spec, or the queueable one from 3.1+?

7. @Jason
This unit is shipping on Monday and is boxed up, so we can’t do runs on it anymore. This said, we may be building another one quite soon, of identical design. The financial types really like them.
On the certain TLA and getting “free” research dollars for their product R&D … I am just not hearing the alternative side of this. I am looking for it. If someone wants to talk to me anonymously, I’d be happy to arrange for this. It sounds to me like a portion of the story is out, and I am sure I can interest various publications in a well-researched article on it. And I am intensely curious as to the other side. Right now, I see what amounts to a perfectly reasonable business decision on the TLA’s part. But all decisions have consequences, and the impact can be non-local. As noted, I am very curious about this. Ping me offline if there is interest, and we can work out a mechanism.
@Chris
Not sure which variant on these units. Will research. Just did a firmware update to correct for other issues.

8. Hi Joe,
very nice results for a single machine – kudos! 🙂
Can I assume that you used xfs and a quite recent kernel to squeeze these numbers out of that unit?
Btw: In the JRFlash column at http://scalableinformatics.com/jackrabbit, I see “IOPs/chassis: 1800k (SSD) to 1800k (PCIe Flash)”. The SSD number is probably a typo.

9. @Sven
Yes to both. Will be fixing the SSD number later on. Thanks! It should be ~330k (which is what we measured). The PCIe Flash number will be changing too … (big smile)