There is a clear and present need for meaningful metrics for HPC and storage

As the discussion of the amazing performance of the K machine continues, one needs to ask how well correlated the numbers are with the performance an end user can realistically expect. That is, how useful is Top500 as an actual predictor of system performance for a particular task? The same question applies to Graph500, SPEC*, etc. How useful is Green500 at predicting the power utilization and likely throughput of a specific design?
Basically, I am not trying to minimize the efforts put into these. I am trying to ask how useful they are for real system comparison.
Way back in the early SPEC days, people made the mistake (and likely still do make this mistake) of comparing SPEC numbers and assuming that the ratio of two numbers will yield even a rough guide to the ratio of real performance. What I remember from those days was taking actual customer benchmarks running on one of them thar hot Alpha chips with some amazing (at the time) SPEC numbers, and comparing them to an obviously “slower” (according to the SPEC numbers) R8000 processor. The R8000 was faster at the actual test code. The customer didn’t understand it.
I think the issue is that these numbers are oversold in terms of utility. Doug Eadline wondered about this not so long ago in an article (I’ll dig out the link soon). At the end of the day, Top500 really doesn’t matter. As James Cuff tweeted, it’s something of a … er … member length contest … more than anything else (go to Urban Dictionary and look it up if you aren’t sure of the context … won’t link here … it’s a semi-professional blog!)
I’ve had this discussion before, and never really reached a satisfactory conclusion. Most customers want to test their codes, and should test their codes. Benchies are nice, but they only tell you part of the story. In the case of storage, it may be a little simpler, but still, it’s a similar problem. IOPS measurements tell you something, but don’t give a feel for how responsive something is.
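To make the IOPS point concrete, here is a rough sketch in Python. The two latency traces are made-up illustrative data, not measurements from any real device, and the `summarize` helper is a hypothetical name. It assumes requests are issued one at a time, so that IOPS is simply request count divided by total service time.

```python
import statistics

def summarize(latencies_ms):
    """Return (iops, median_ms, p99_ms) for per-request latencies in ms.

    Assumes one request in flight at a time, so total time is the sum
    of the individual latencies.
    """
    total_seconds = sum(latencies_ms) / 1000.0
    iops = len(latencies_ms) / total_seconds
    ordered = sorted(latencies_ms)
    # Simple nearest-rank style 99th percentile.
    p99 = ordered[min(len(ordered) - 1, int(0.99 * len(ordered)))]
    return iops, statistics.median(ordered), p99

# Hypothetical traces: both average exactly 1 ms per request.
steady = [1.0] * 1000                    # every request takes 1 ms
spiky = [0.5] * 990 + [50.5] * 10        # mostly fast, with occasional stalls

for name, trace in (("steady", steady), ("spiky", spiky)):
    iops, median_ms, p99_ms = summarize(trace)
    print(f"{name}: {iops:.0f} IOPS, median {median_ms} ms, p99 {p99_ms} ms")
```

Both traces come out at the same IOPS figure, yet the second one stalls for about 50 ms on one request in a hundred. That is the kind of thing a user feels as "unresponsive" and a single throughput number hides.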

1 thought on “There is a clear and present need for meaningful metrics for HPC and storage”

  1. I’ve had the same thought for a while. I think it would be really interesting to regularly poll CPU performance counter metrics (i.e., instruction counts, cache hits, branch misses, etc.) and correlate them with what processes are running on a node. Over time you could build records of how individual applications run and with what variability. If you further correlated it with IO and network statistics you could start getting a really good view of how the system is being utilized.
    Still doesn’t give you a good idea of how much science is being done though. 😉
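
The bookkeeping side of the idea in the comment above could look something like the sketch below. It assumes the counter samples have already been collected somehow (for example with a tool such as `perf stat` on Linux) and tagged with the application that was running. The sample values, the application names, and the `summarize_by_app` helper are all hypothetical, for illustration only.

```python
from collections import defaultdict
import statistics

def summarize_by_app(samples):
    """Group (app, instructions, cycles) samples by application.

    Reports, per app, the number of samples, the mean instructions per
    cycle (IPC), and the coefficient of variation of IPC as a rough
    measure of run-to-run variability.
    """
    ipc_by_app = defaultdict(list)
    for app, instructions, cycles in samples:
        if cycles > 0:  # skip empty sampling intervals
            ipc_by_app[app].append(instructions / cycles)

    report = {}
    for app, ipcs in ipc_by_app.items():
        mean_ipc = statistics.mean(ipcs)
        spread = statistics.pstdev(ipcs)
        report[app] = {
            "samples": len(ipcs),
            "mean_ipc": mean_ipc,
            "cv": spread / mean_ipc if mean_ipc else 0.0,
        }
    return report

# Made-up samples: (application, instructions retired, cycles elapsed).
samples = [
    ("solver", 2_000_000, 1_000_000),
    ("solver", 2_200_000, 1_000_000),
    ("solver", 1_800_000, 1_000_000),
    ("io_heavy", 400_000, 1_000_000),
    ("io_heavy", 1_200_000, 1_000_000),
]

for app, stats in summarize_by_app(samples).items():
    print(app, stats)
```

IO and network statistics could be folded in the same way, as extra fields per sample. And as the comment says, none of this tells you how much science got done.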
