Having fun writing a presentation about molecular dynamics and big data

Who’da ever thunk that MD simulations would start to become large enough to present IO and analysis problems?
Way way back when the digital supercomputing dinosaurs roamed the earth, looking for problems to crunch on, I simulated gallium arsenide on some of these machines.
I’d be lucky to get 100 time steps done, in a week, for 64 atoms. Take each atom in double precision, with position, velocity, and atom type; let’s be generous and call this 64 bytes per atom in binary, or 80 bytes (one terminal line) per atom in text. 64 atoms gave me 5kB/time step at most.
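For the curious, here is that back-of-the-envelope arithmetic as a tiny Python sketch. The function name is made up for illustration, and it assumes a frame is nothing but per-atom records (no header, no box dimensions, no metadata).

```python
def bytes_per_step(n_atoms, bytes_per_atom):
    """Size of one time step of output, counting only per-atom records."""
    return n_atoms * bytes_per_atom


# 64 atoms at the generous 64 bytes per atom in binary
binary_frame = bytes_per_step(64, 64)

# 64 atoms at one 80-character terminal line per atom in text
text_frame = bytes_per_step(64, 80)

print(f"binary: {binary_frame} bytes ({binary_frame / 1024:.1f} KiB)")
print(f"text:   {text_frame} bytes ({text_frame / 1024:.1f} KiB)")
```

So the binary case is 4 KiB per time step and the text case is 5 KiB, which is where the 5kB ceiling comes from.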
And yes, it really did take a week per 100 time steps back then. Newer hardware got this up to 5 minutes per time step. Ok, this was newer hardware in 1995. Your mileage may vary. Supercells in the mirror are larger than they appear.
And yes, the last time I ran the code on a laptop (6 or so years ago) it took … well … 10 seconds or so per time step.
But what’s interesting to me today is that researchers are aiming for microsecond and millisecond simulations, with reasonable physics and chemistry theory levels (that is, not simply a baseline hard ball or Lennard-Jones potential, but something that could have realistic meaning). Which, if you are using time steps of order 1 picosecond (10^-12 seconds), means you have hundreds of millions of time steps to do. And lots of output to generate and analyze.
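To put rough numbers on that, here is a hedged Python sketch of the step count and raw output size. Everything in it beyond the 1 picosecond time step and the 64 bytes per atom is an assumption for illustration: the function names, the 100,000-atom system, and the idea of writing every single step are all hypothetical. Many production MD codes use femtosecond-scale time steps, which would push the step counts up by roughly a factor of 1000.

```python
PS_PER_MICROSECOND = 10**6
PS_PER_MILLISECOND = 10**9


def n_steps(simulated_ps, dt_ps=1):
    """Number of time steps needed to cover simulated_ps picoseconds."""
    return simulated_ps // dt_ps


def trajectory_bytes(steps, n_atoms, bytes_per_atom=64, write_every=1):
    """Raw trajectory size if one frame is written every write_every steps."""
    frames = steps // write_every
    return frames * n_atoms * bytes_per_atom


steps_us = n_steps(PS_PER_MICROSECOND)   # a microsecond at 1 ps per step
steps_ms = n_steps(PS_PER_MILLISECOND)   # a millisecond at 1 ps per step

# Hypothetical 100,000-atom system, writing every step of a millisecond run
total = trajectory_bytes(steps_ms, n_atoms=100_000)

print(f"steps for 1 microsecond: {steps_us:.1e}")
print(f"steps for 1 millisecond: {steps_ms:.1e}")
print(f"raw output: {total / 1e15:.1f} PB")
```

Under those assumptions a single millisecond run lands in the petabyte range before you have analyzed anything. Writing a frame only every thousand steps (`write_every=1000`) still leaves terabytes per run.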
This rapidly becomes a big data problem. Now take many of these simulations for large screening operations, and it’s an “oh my gosh it’s coming this way” data size problem. And no, this is not a cloud issue. It’s well outside the realm of public clouds’ sweet spots.
Massive data needs massive firepower. Large scale simulation needs scalable storage and computing tightly coupled to it.
And I have to admit, I am having a great deal of fun looking up recent journal articles. I even cracked open my thesis and old writings. I was amused when I saw a journal article on something I had commented on informally about 8 years prior to that publication (and alluded to in my thesis). The Wikipedia article on this phenomenon reads like the notes I had written on my observations.
Well, back to the presentation. It’s fun to (tangentially) work on science-like things again. I get to do too little of this. Though I have to be careful what I say, lest my wife think I want to go be a physics prof somewhere.
