NSF is funding a 2×10^5 processor monster machine at NCSA. At $208M, each dollar will buy you 4.8 MFLOP (4.8×10^6 FLOP).
Assuming a quad-core CPU would be able to provide (in theory) 32 GFLOP (4 cores × 8 GFLOP/core), you would need 31,250 units to provide this (roughly 1 PFLOP in total) … (125,000 cores).
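For anyone who wants to check the arithmetic, here is a quick back-of-the-envelope sketch. The 1 PFLOP target is my rounding of what $208M at 4.8 MFLOP per dollar implies, not a number quoted from the article, and all the variable names are just for illustration.

```python
# Back-of-the-envelope check of the figures above.
# Integer arithmetic is used so the results are exact.

COST_DOLLARS = 208_000_000        # $208M price tag
FLOP_PER_DOLLAR = 4_800_000       # 4.8 MFLOP per dollar
CORES_PER_CPU = 4                 # quad-core CPU
FLOP_PER_CORE = 8 * 10**9         # 8 GFLOP per core, theoretical peak

# Performance implied by price and FLOP per dollar (just under 1 PFLOP).
implied_flop = COST_DOLLARS * FLOP_PER_DOLLAR

# Assumption: round the implied figure up to a clean 1 PFLOP target.
TARGET_FLOP = 10**15

flop_per_cpu = CORES_PER_CPU * FLOP_PER_CORE      # 32 GFLOP per CPU
cpus_needed = TARGET_FLOP // flop_per_cpu         # quad-core units required
cores_needed = cpus_needed * CORES_PER_CPU        # total cores

print(f"implied performance: {implied_flop:.4e} FLOP")
print(f"quad-core CPUs needed: {cpus_needed:,}")
print(f"cores needed: {cores_needed:,}")
```

Note that 125,000 theoretical-peak cores is well under the 2×10^5 processors being funded, which is about what you would expect once you allow for sustained rather than peak performance.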
There are some interesting things about this machine. Very interesting … not just the price tag or the estimated sustained performance.
It is a shared memory machine. Quoting the article:
“All of that memory and storage will be globally addressable, meaning that processors will be able to share data from a single pool exceptionally quickly, researchers said.”
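OK, they did say globally addressable, not necessarily “shared”. These have slightly different contextual meanings: globally addressable means any processor can name any location in the pool, while “shared” usually also implies that ordinary loads and stores reach that memory coherently, as on an SMP.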
I want to know how they are going to program it. If it is a shared memory machine, then, technically, we could use OpenMP. Which means I could write simple loops (in theory), and have them spread far and wide (in theory). In practice this doesn’t work well without some significant help and hints from the compiler/user.
Or maybe this is just a 447×447 cell spreadsheet for Amir, with each cell being a processor, local RAM, and some local code.
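In case the 447 looks arbitrary: it is the largest square grid that fits inside 2×10^5 processors. A small sketch of that, plus a purely hypothetical `cell_of` helper that maps a flat processor index to a spreadsheet cell (row-major order is my assumption, nothing from the article):

```python
import math

PROCESSORS = 200_000  # 2×10^5 processors

# Largest square grid that fits within the processor count.
side = math.isqrt(PROCESSORS)
cells = side * side
leftover = PROCESSORS - cells  # processors that do not fit in the square


def cell_of(rank, side=side):
    """Map a flat processor index to a (row, column) cell, row-major.

    Hypothetical helper for illustration only.
    """
    if not 0 <= rank < side * side:
        raise ValueError(f"rank {rank} is outside the {side}x{side} grid")
    return divmod(rank, side)


print(f"grid: {side} x {side} = {cells:,} cells, {leftover} processors left over")
print("last cell:", cell_of(cells - 1))
```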
We are getting to the point, rapidly, where we may need to think about processors as being part of a continuum, a hive, and not as discrete entities unto themselves. This harkens back to Doug Eadline’s articles on how self-organization in large colonies tends to evolve successful models of behavior for programs … er … ants and insects.