Blue waters are a-movin…

NSF is funding a 2×10^5 processor monster machine at NCSA. At $208M, each dollar will buy you 4.8 MFLOP (4.8×10^6 FLOP).


Assuming a quad-core CPU would be able to provide (in theory) 32 GFLOP (4 cores × 8 GFLOP/core), you would need 31,250 units to provide this … (125,000 cores).

There are some interesting things about this machine. Very interesting … and not just the price tag or the estimated sustainable performance.

It is a shared memory machine. Quoting the article:

All of that memory and storage will be globally addressable, meaning that processors will be able to share data from a single pool exceptionally quickly, researchers said.

Ok … they did say “globally addressable,” not necessarily “shared.” These have slightly different contextual meanings.

I want to know how they are going to program it. If it is a shared memory machine then, technically, we could use OpenMP, which means I could write simple loops (in theory) and have them spread far and wide (in theory). In practice, this doesn’t work well without some significant help and hints from the compiler/user.

Or maybe, this is just a 447×447 cell spreadsheet for Amir with each cell being a processor, local ram, and some local code.

We are getting to the point, rapidly, where we may need to think about processors as being part of a continuum, a hive, and not as discrete entities unto themselves. This harks back to Doug Eadline’s articles on how self-organization in large colonies tends to evolve successful models of behavior for programs … er … ants and insects.
