As the discussion of the amazing performance of the K machine continues, one needs to ask how well these numbers correlate with the performance end users can actually realize. That is, how useful is Top500 as a predictor of system performance for a particular task? The same question applies to Graph500, SPEC*, etc. How useful is Green500 at predicting the power utilization and likely throughput of a specific design?
To be clear, I am not trying to minimize the effort put into these lists. I am asking how useful they are for comparing real systems.
Way back in the early SPEC days, people made the mistake (and likely still make it) of comparing SPEC numbers and assuming that the ratio of two numbers gives even a rough guide to the ratio of real performance. What I remember from those days was taking actual customer benchmarks, running them on one of them thar hot Alpha chips with some amazing (at the time) SPEC numbers, and comparing the results to an obviously “slower” (according to the SPEC numbers) R8000 processor. The R8000 was faster on the actual test code. The customer didn’t understand it.
I think the issue is that these numbers are oversold in terms of utility. Doug Eadline wondered about this not so long ago in an article (I’ll dig out the link soon). At the end of the day, Top500 really doesn’t matter. As James Cuff tweeted, it’s something of a … er … member length contest … more than anything else (go to Urban Dictionary and look it up if you aren’t sure of the context … won’t link here … it’s a semi-professional blog!)
I’ve had this discussion before, and never really reached a satisfactory conclusion. Most customers want to test their own codes, and they should. Benchies are nice, but they only tell you part of the story. In the case of storage, it may be a little simpler, but it’s still a similar problem. IOPS measurements tell you something, but they don’t give you a feel for how responsive a system is.
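
To make that last point concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `summarize` helper is hypothetical, and the two latency lists are invented numbers, not measurements from any real device. It just shows how two systems can report the same IOPS figure while feeling very different to a user.

```python
import statistics


def summarize(latencies_ms):
    """Return (iops, median_ms, p99_ms) for I/Os issued one at a time.

    Hypothetical helper: assumes a single worker issuing requests serially,
    so total elapsed time is simply the sum of the per-request latencies.
    """
    total_ms = sum(latencies_ms)
    iops = len(latencies_ms) * 1000.0 / total_ms
    ordered = sorted(latencies_ms)
    # Simple nearest-rank style 99th percentile using integer arithmetic.
    idx = min(len(ordered) - 1, (99 * len(ordered)) // 100)
    return iops, statistics.median(ordered), ordered[idx]


# Invented latency samples in milliseconds (NOT real measurements).
steady = [1.0] * 100                 # every request takes 1 ms
spiky = [0.5] * 98 + [25.5] * 2      # mostly fast, with occasional stalls

for name, sample in (("steady", steady), ("spiky", spiky)):
    iops, median_ms, p99_ms = summarize(sample)
    print(f"{name}: {iops:.0f} IOPS, median {median_ms} ms, p99 {p99_ms} ms")
```

Both made-up samples work out to the same IOPS number, but the second one stalls for tens of milliseconds every so often. That is exactly the sort of thing a single headline figure hides, and why running the real workload matters.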