"The Grid"(TM) (with extra hype, no information content …)

I read an “amusing” piece this past weekend, where people connected with the LHC project at CERN talked about how they would do data distribution and computation. Basically they are building their own data network, and doing some interesting bits with large volume data caching/distribution.

Then we get this.
You know, the “Grid” will make the internet obsolete.
Oh bother.
Let me ask a simple question of said media outlets. Is it possible to, say, assign stories to people who understand what they are writing about? Or at least vet the story with someone with a clue?
(shakes his head in disbelief)
The Grid is an unfortunately overused and abused term for a distributed computing paradigm that is more loosely coupled than a cluster. Its current incarnation is what you hear in the “cloud” computing discussions. I would argue that a better term would be “nebula”, though that is a little tongue-in-cheek.
Using the current “cloud computing” terms, there are quite a few applications that do benefit from distributed computing implementations. Distributed data processing is one of them, and that is being used for the LHC work.
The problem in any distributed computing model is data motion. If data motion is your problem, as in your code’s algorithms depend upon timely delivery of data, then you are bound by latency (how long you have to wait for data to be available after requesting it), by bandwidth (how much data you can pump down the pipe to your program), or by both.
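To put rough numbers on that, here is a minimal back-of-the-envelope sketch. It assumes the simplest possible model (one request latency plus size divided by bandwidth), and the function names, the 4.7 GB DVD size, and the link speeds are my own illustrative assumptions, not anything taken from the LHC design.

```python
def transfer_time(size_bytes, bandwidth_bytes_per_s, latency_s=0.0):
    """Rough time to fetch size_bytes: one request latency plus streaming time.

    Simplified model: ignores protocol overhead, congestion, and retransmits.
    """
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return latency_s + size_bytes / bandwidth_bytes_per_s


def required_bandwidth(size_bytes, target_s, latency_s=0.0):
    """Bandwidth (bytes/s) needed to finish within target_s, given the latency."""
    usable_s = target_s - latency_s
    if usable_s <= 0:
        raise ValueError("latency alone already exceeds the target time")
    return size_bytes / usable_s


# Illustrative assumptions only.
DVD_BYTES = 4.7e9            # single-layer DVD, roughly 4.7 GB
HOME_LINK = 10e6 / 8         # hypothetical 10 Mbit/s consumer line, in bytes/s

if __name__ == "__main__":
    seconds = transfer_time(DVD_BYTES, HOME_LINK, latency_s=0.1)
    print(f"DVD over the hypothetical home link: {seconds / 60:.1f} minutes")

    needed = required_bandwidth(DVD_BYTES, target_s=2.0, latency_s=0.1)
    print(f"Bandwidth needed for a 2 second DVD: {needed * 8 / 1e9:.1f} Gbit/s")
```

The point of the sketch is only that the answer depends on the pipe running all the way to your machine. A fast backbone between data centers does nothing for whoever sits at the end of a slow last hop.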
The LHC folks built a data distribution network above the normal internet, as the internet connections did not have the necessary bandwidth out to the primary data sites. The secondary sites are connected (as I understand it) by dedicated bandwidth lines, possibly with bandwidth reservation over the internet connections, or possibly with their own fibre pulls. The tertiary sites are connected via the internet as far as I know.
Yet this article breathlessly talks about downloading DVDs in 2 seconds. It talks about how the Internet has been made obsolete by this distribution network.
Maybe it is time for those media outlets to hire scientists and engineers who can write and explain things, rather than having a non-technical person attempt to string nice-sounding technical “factoids” together into something that, on its face, looks like a story, but really isn’t.
LHC is building a data distribution network. They are doing so because the existing internet connections between primary and secondary data centers are not fast enough to handle the extreme data outflows. From there, they will be distributing the data to computing systems in a loosely coupled conglomeration called “the grid”, which will handle the tremendous volume of computing tasks needed to find signals in their noise. They need so much data because their events are very rare, so they need to gather as much information as possible in order to get reasonable statistical measures and error estimates. The computing load is intense. Only distributed groups of machines (clusters and grids) could handle it. The data load is intense as well: several gigabytes per second.
Sounds quite a bit different from “downloading DVDs in 2 seconds” now, doesn’t it?

3 thoughts on “"The Grid"(TM) (with extra hype, no information content …)”

  1. Reading the article in the Times, I had a strange feeling that something was wrong, but thought that maybe I had missed some aspects of Grid computing…

  2. Hey, you got it. Thanks a lot for this article. You just said what I wanted to. I have already bashed a couple of hyped media articles. I have been working with grid computing for 3 years now (MSc in Grid Computing with Univ of Amsterdam and now PhD ..).
    These news reports are really hilarious. I could not stop myself laughing a bit after reading one of the articles, which says that a movie will now take 3 seconds to download from Japan to Europe. How dumb of a reporter to write this in the context of a CERN grid.
    Anyways… hope the media comes of age and writes more sensibly.

  3. @Wawrzek
    Took me 2 reads, and a cup of coffee to realize that my brain wasn’t in “non-working” order …
    We need some practitioners in the field to explain to a writer in simple language, what grid/cloud computing is, what it brings, and how it will be useful. Then we have to help said writer constrain their creative writing urges.
    The sad part about the hype is, there is real value in the concepts, ideas, and hopefully soon, the implementations. There is significant value for business (and you can see it in terms of a cloud of VMs) for DR/continuity, and all sorts of other things (scale-on-demand). But articles like what I read do a disservice to the concepts. Well, that and some of the marketing folks who happily play into the nebulous wording …
