On the state of Microsoft's HPC effort

Reports are in that a Windows-based system was able to crack the 1PF barrier, but that the same system, running Linux, was faster.
Kudos to Microsoft for this … but I have to ask … really … if this statement from Bill Hilf is true:

“We’re not trying to beat Linux,” Hilf says. “We’re not trying to be a supercomputing company. We’re trying to say ‘how do we mainstream all of this stuff so that HPC becomes broadly available at all levels.'”

then why is Microsoft competing in the stratospheric regime of performance if it’s not trying to be there?
I see a fundamental disconnect between actions and words. You can ignore words, not actions. Microsoft is looking for a perceived win against Linux that it can take to marketing. This is of little real value to them, and I hope they recognize this soon.
The reality of the cloud and cycles upon demand models, which I believe will come to dominate HPC in the near future for everything not desktop based, is that the OS will be an incidental implementation detail. The lowest cost OS is going to win the majority of the use cases (as it is now). Economics again, pure and simple.
So it’s actually in Microsoft’s best interests to figure out a) how to give away their OS for nearly free to HPC sites, and b) how to run it diskless/iSCSI-based via PXE booting/installing, and make it as trivial to install and operate as Linux is. This is IMO far more important than climbing the Top500 list. If they don’t, they risk ending up in the same state they find themselves in the cellphone/PDA/smartphone market.

7 thoughts on “On the state of Microsoft's HPC effort”

  1. @Anon
    I should also point out that the marketing folks are having their fun, albeit with some … well … dubious-at-best claims. The money quote:

    But if Windows HPC Server were truly less expensive than Linux, even in large-scale clusters, you might expect to see significant usage of Windows in the world’s largest supercomputers. That is not the case. Among the 500 fastest supercomputers on the planet, just five run Windows and 455 run Linux, according to the Top 500 list of supercomputers.
    If Windows really is cheaper than Linux, then a lot of people are wasting money.

    The point is, most customers know better. I expect that had Microsoft had a 1PF score on the same machine greater than or equivalent to the Linux score, we’d hear all about “how Windows 2008 HPC server R2 is faster than Linux” … or other such things.
    This said, given the number of Linux installs at end user sites, we are finding more and more very competent Linux admins out there, even in mostly Windows shops, which makes for a much lower barrier (as in, no training required) to the use of Linux-based clusters and systems.
    So I’d argue that it’s unlikely that the basic premise (Windows only and no Linux experience) is the norm at many places. But Microsoft is making hay of this. That was my point in the original post.

  2. Also … I should point out that it would be to their benefit to figure out how to have their OS boot from a Linux PXE boot server. Last I checked, a few months ago, this fell under the category of ‘hard’.

  3. This is interesting

    Tsubame is a remarkably energy efficient, general-purpose supercomputer with about 2,000 users in academic and industry research circles. Because Tsubame uses a KVM hypervisor and various cloud-like provisioning tools, it can run both Windows and Linux at the same time on different nodes, and offer users various types of processing configurations.

    What’s interesting about this is that KVM is a Linux kernel hypervisor. Which means, while it was running Windows, it was running Linux underneath to handle the hardware. Interesting. And dare I say, this is precisely in line with what I mean when I remark that the OS is just an implementation detail on the compute nodes.
    I wonder which set of cloud provisioning tools they used. Eucalyptus, or some of the others?

  4. We at Schrödinger ported several apps to Windows HPC Server a year or so ago. We benchmarked them and wrote a white paper, and were happy with the results. We were not able to benchmark Linux on the same hardware, but what I have seen leads me to believe that the results would have been essentially identical. Our motivation was supporting small customers who are Windows-only and who had no upgrade path with us other than Windows clusters. Also, I spoke to a few people in the fluid-dynamics and finite-difference community for whom Windows HPC ports have been a great commercial success. Unfortunately, in our space, our target market (biotechs) dried up with the economy, though it does show some signs of reviving now.
    Our sysadmins, who come from UNIX/Linux backgrounds, were pleasantly surprised with the ease of configuration and administration of WinHPC clusters. The monitoring tools are great, and the guy who did most of our WinHPC admin said that it would certainly be easier for a Windows admin with no prior HPC experience to configure and run his first WinHPC cluster than it would be for a Linux admin with no prior HPC experience to configure and run his first Linux cluster. (We’ve been running Linux clusters in our company for over 10 years.)
    As far as Microsoft’s real motivation is concerned, it’s hard to tell. They have a history of pushing something for a few years and then pulling back. Lately, they seem to be emphasizing Java over .NET, and have pulled back from Iron Python and similar initiatives. In HPC, they’ve definitely pulled back this fiscal year. Several years ago, one of their people told me that Microsoft’s real motivation is to learn the technology, learn the ropes, then transfer/apply it to more mainstream apps. Perhaps believing this is to ascribe too much method to the madness, but on the other hand, it’s consistent with what I’ve seen. And by the way, they’ve been great folks to work with.

  5. (I don’t know why your comment system thought I was anonymous above)
    None of the Tsubame Linpack runs (Windows or Linux) were done under a hypervisor.
    I’ve never heard of a Top500 result with the OS running in a VM. Maybe Amazon’s entry did?
    John Vert

  6. @John
    WordPress likes people to log in for comments if they don’t want to be anonymous.
    The Tsubame description suggested everything was running as a KVM instance.

    Tsubame uses a KVM hypervisor and various cloud-like provisioning tools, it can run both Windows and Linux at the same time on different nodes, and offer users various types of processing configurations

    Which does mean that if they were running KVM at the time, then you were virtualized (and so was Linux), and if you weren’t running KVM at the time, then likely you weren’t running in the end user configuration, which means the numbers are dubious. The Tsubame people are pretty bright. I don’t think they would run on the system in a different way than end users would use it.
    We’ll need to get some clarification on this.
    I haven’t seen the latest list, so I don’t know if Amazon has an entry.
