What high performance isn’t

We’ve had a number of interesting interactions with customers over the last few weeks. They all seem to center on how to get high performance out of gear which isn’t designed for high performance.

Generally speaking, you can’t.

High performance requires both good design and good implementation, built from well designed and well implemented parts.

High performance isn’t

  • A random collection of web and file servers joined together with clustering tools
  • Some random tier 1 box, usually used as a lower end file server, stuffed with disks/SSD/Flash
  • A poorly architected, but easy to purchase system (e.g. makes IT happy, but the users sad)

I’ve talked in the past about IT clusters and IT storage. We are seeing a resurgence of both, but now add to the mix Microsoft Windows HPC Server on the IT clusters (with the IT folks not knowing how to run/manage them), and (insert your favorite brand name tier 1 vendor here) IT storage, now with a Flash card or SSD dropped in, or with Gluster tying together individual disks on many units.

As I mentioned in the past, a pile-o-pc’s (and a blade system running Windows is a pile-o-pc’s) is not a high performance cluster … it’s not even remotely such.

But the do-it-yourself phenomenon continues with Flash/SSD and Gluster now.

What’s wrong with this statement: “Hey, let’s put an 80k IOP SSD into this machine, and our storage will be much faster”?

And what’s wrong with this statement: “Hey, let’s put a 100k IOP Flash card into this machine, and our storage will be much faster”?

And further, what’s wrong with this statement: “Hey, let’s tie our disks together with Gluster so we can get GB/s speed”?

All are decidedly untrue. There are many things that impact performance, and when the rhetorical rubber meets the rhetorical road, people are often surprised at how little “much faster” actually is.
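To make the point concrete, here is a rough back-of-the-envelope sketch. Every number in it is a made-up assumption for illustration (the 4 KiB I/O size, the controller rate, the 1 GbE link), and the function names are hypothetical, not from any real tool. It simply treats the data path as a chain of stages and reports the slowest one, which is a simplification of how real systems behave.

```python
# Hypothetical numbers for illustration only; none come from a measured system.

def iops_to_mb_per_s(iops, io_size_bytes):
    """Convert an IOPS rating to bandwidth in MB/s (1 MB = 10**6 bytes)."""
    return iops * io_size_bytes / 1e6


def slowest_stage(stages):
    """Return (name, rate) of the slowest stage; it caps the whole data path."""
    name = min(stages, key=stages.get)
    return name, stages[name]


stages = {
    # An "80k IOP" SSD at an assumed 4 KiB per I/O: 327.68 MB/s on paper.
    "ssd": iops_to_mb_per_s(80_000, 4096),
    # Assumed rate for an older controller in the box.
    "controller": 250.0,
    # 1 Gb/s line rate, before any protocol overhead.
    "network (1 GbE)": 125.0,
}

bottleneck, rate = slowest_stage(stages)
print(f"{bottleneck} limits the path to {rate:.0f} MB/s")
```

Under these assumptions the shiny new SSD never gets to run at its rated speed, because the network in front of it tops out well below what the device can deliver. Swap in your own numbers for each stage and the same reasoning applies.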

The Gluster bit just annoys me though. Gluster is a fine technology, and they have a great team. It is being marketed, to some degree, as “toss this on your random collection of boxes to make them go faster.” That isn’t Gluster’s use case, and it’s wrong to market it in that manner.

But that is, in a general sense, how the storage software industry markets itself. Toss this on a random collection of boxen and stuff gets bigger, faster, better, tastier.

But it’s not. It’s not high performance.

High performance is a careful balance of design and implementation, requiring attention to detail in hardware, software, and other elements of the stack. Anyone telling you any different is trying to sell you something.

Viewed 47166 times by 6668 viewers

2 thoughts on “What high performance isn’t”

  1. “Viewed 13829 times by 1082 viewers”

    Really?

    I must read Joe’s posts many fewer times than average 🙂

  2. @Larry

    The search engines and spam bots have been filtered out. I believe the 1082 viewer number. Some clients read and reread. When we go in via a different browser to check things out (formatting, display, etc.), the counters increment by 1 for the default view, and only the view counter increments when I display the whole article.

    This said … I am not sure of the code paths … I didn’t write this code, it came from WordPress. Wouldn’t surprise me if stuff was slightly off (not impugning WordPress or the coders … just I haven’t checked the code so I am not sure). Easiest way is to look at the database tables, and yes, we have a complete record of the visitors’ IPs, client signatures, etc. Last time I looked at this, in passing a few months ago, the results of some hand queries meshed well with what WordPress reports.

    Note that this is also not an ad-supported site, and we aren’t (and don’t) astroturf, so …

    It is at least possible … that we have that many viewers. Add to this that Chris Samuel retweeted the post, as did HPC_guru, and Rich Brueckner put it on InsideHPC. It’s plausible.

    Not saying the content is that good, just people came by to look at it.

    I did experiment some time ago to see if page reloads impacted things. They did impact the Viewed number, but not the viewer number. Having some multiple of the viewer number (2-4) is entirely plausible, and if we add in a few reloads (on RSS readers or similar devices), yeah, the Viewed number is possible.

    That was simply a theory to explain why they are more than an order of magnitude different.

    What I’d rather measure is the number of commenters (or authors, if others want to write something and post it here).

    Here I am lacking somewhat (c’mon lurking readers … share your thoughts!). Then again, a higher volume site like InsideHPC or ClusterMonkey has a similar issue, and we aren’t competing with them (for revenue, eyeballs, attention).

    Basically, what I am saying is that the numbers aren’t out of the question … though I do raise my eyebrows at some of them. This one has about 39k views and 3k viewers, while this one has 52k views and 2.5k viewers.

    I haven’t checked to see if going to the home page increments all the “Viewed” numbers, though I suspect it does … so there could be some inflation of that. But then again, I think the real issue is that the number of viewers is about correct … (already corrected for spam bots, search engines, etc.).

    I can tell you that this server is under pretty constant load. Not high, but you can see it.
