# Highest sustained spinning rust write speed to date on a box

Yeah … the day job hardware. Current generation.
Single box. Single thread. Single file system.

Run status group 0 (all jobs):
WRITE: io=130692MB, aggrb=4702.9MB/s, minb=4815.8MB/s, maxb=4815.8MB/s, mint=27790msec, maxt=27790msec


File size is several times RAM size.
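As a quick sanity check on the summary line above, the aggregate bandwidth should be the total data written divided by the elapsed time. The sketch below redoes that arithmetic with the two figures from the fio output. The function and variable names are made up for illustration and are not part of fio.

```python
# Figures copied from the fio summary line above.
io_mb = 130692        # "io=": total data written, in MB as fio reports it
runtime_ms = 27790    # "mint=" / "maxt=": run time in milliseconds


def aggregate_bandwidth_mb_s(io_mb: float, runtime_ms: float) -> float:
    """Average bandwidth in MB/s: data written divided by elapsed seconds."""
    return io_mb / (runtime_ms / 1000.0)


bw = aggregate_bandwidth_mb_s(io_mb, runtime_ms)
print(f"{bw:.1f} MB/s")  # about 4702.8, within rounding of the reported aggrb
```

The result lands within rounding of the reported aggrb=4702.9MB/s. The minb/maxb figures are about 2.4% higher, which looks like a unit-conversion difference in the report rather than a second measurement, though that is a guess.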

### 5 thoughts on “Highest sustained spinning rust write speed to date on a box”

1. And in the “but storage doesn’t matter” trenches… I keep having to read ~.25TiB off a single, *slow* disk because, well, you know. I see about 10MB/s. Makes debugging correctness rather painful. sigh.

2. @Jason
Just saw this on OpenMPI list. Reported in all its glory, reproduced without edits.

Subject: [OMPI users] IO performance
Recently the organization I work for bought a modest sized Linux cluster
for running large atmospheric data assimilation systems.  In my
experience a glaring problem with systems of this kind is poor IO
performance.  Typically they have 2 types of network: 1) A high speed,
low latency, e.g. Infiniband, network dedicated to MPI communications,
and, 2) A lower speed network, e.g 1Gb or 10Gb ethernet, for IO.  On
clusters this second network is usually the basis for a global parallel
file system (GPFS), through which nearly all IO traffic must pass.  So
the IO performance of applications such as ours is completely dependent
on the speed of the GPFS, and therefore on the network hardware it uses.
We have seen that a cluster with a GPFS based on a 1Gb network is
painfully slow for our applications, and of course with a 10Gb network
is much better.  Therefore we are making the case to the IT staff that
all our systems should have GPFS running on 10Gb networks.  Some of them
have a hard time accepting this, since they don't really understand the
requirements of our applications.



Emphasis added where needed. This jibes well with our experience.

1. The vast majority of HPC systems have poor to terrible IO designs.
2. Non-specialists who don’t understand the problem being solved should really have no say in designing the solution to the problem.
3. As often as not, bad to overtly terrible designs are foisted on end users in the name of a) maximizing CPU count, b) minimizing “extra” expenditures, c) some other relatively arbitrary measure of “goodness” or a tickbox that is checkable for their next performance review …

I once opined that I could tell the difference between HPC clusters and IT clusters. I saw one recently … well … you wouldn’t want to know what I saw. And it was firmly … soundly … an IT design. Had no relationship to any HPC design I’ve ever seen (apart from being a pile of PCs).
But storage is often … if not most of the time … given very short shrift in acquisitions. It’s more of a “yeah, let’s toss a 10TB disk on there”. Or “wow, these 6G backplanes are fast, let’s buy lots of cheap disk and 1 underpowered fake RAID card and we will have a JackRabbit” (actually had a customer tell me that).
IO matters. As data sets grow huge, CPU performance matters, but memory bandwidth, network bandwidth, and raw IO bandwidth will matter more. Starving CPUs are a terrible thing to have. Starving CPUs because you have 1x 2TB drive reading 1/4 of itself to fill your sim … not such a fast box.
But you know this. Many don’t.
The storage bandwidth wall matters.

3. While I whole-heartedly agree, I do understand the background that rendered I/O secondary. Many traditional (as in modeled well by Top500 results) simulations require small inputs, run for a long time, and produce small outputs. And thus did the Earth Simulator fail… 20GB quotas…
For my immediate uses, there’s a difference between production (not me) and research/debugging (me). Once this is “done,” the I/O requirements are different. As in RAM or fail. But my path to that destination requires I/O. And not only local I/O but also remote I/O. sigh.

4. This post was mentioned on Twitter by Deepak Singh. Deepak Singh said: RT @sijoe: New blog post: This could be game changing for lots of users #hpc

5. I think GPUs (the most likely accelerators that people will look at) are still hampered by memory bandwidth, but I don’t know how much longer it’s going to be like that. Talking to an nVidia guy the other week, he didn’t think there was much on the way to help with that for the foreseeable future. Of course (a) if there was, he might not have been at liberty to talk about it, and (b) there are plenty of people for whom GPUs may be good enough (yes, NAMD, I’m looking at you).