How the Microsoft WCC could be good or bad

Thinking this through, I can spot a number of serious issues with WCC. I won’t go through all of them here; I want to mull them over for a while.

MPI. Supporting a new interconnect is hard. You have to relink your application. This is true on *all* platforms. Some, such as the Scali system, attempt to make this easy by separating layers, allowing you to compile your app once and select the fabric at runtime. This is a great idea. Unfortunately it is not in as common use as I would like. Most of the major ISVs don’t support Scali (a few do); most support MPICH, LAM, and others. LAM has some ability to do this as I remember (I haven’t looked recently), but it is in maintenance mode, with new work going on in OpenMPI. MVAPICH is garnering lots of attention these days due to its support of IB, multirail, RDMA, etc.
But that’s the point. There are lots of them. What is an ISV supposed to write to? Moreover, some of these are (rapidly) moving targets. Moving targets are not good from a developer point of view: they add cost and time, and they reduce quality, as you have to spend more of your limited resources chasing them.
This is not a Linux-specific problem. Windows has the same issue. What Microsoft is basically promising (which would be good) is to unify this (on Windows, of course).
The unification would be a good thing. A nice DLL that lets us use new features when they are ready, without rebuilding the code, would be wonderful. We could do this with Linux as well: the LD_PRELOAD mechanism is very helpful here. For some reason we choose not to, and build statically linked MPI binaries instead. There are arguments for this. I don’t remember them, but some people are staunch defenders of the status quo.
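
As a rough illustration of the "compile once, pick the fabric at runtime" idea, here is a minimal sketch. Everything named in it is an assumption made for illustration: the `FABRIC` environment variable and the two transport classes are hypothetical and do not correspond to any real MPI library's interface. The point is only that the application codes against one interface, and the concrete transport is looked up when the program starts.

```python
import os


# Hypothetical transports: stand-ins for fabric-specific libraries.
class TcpTransport:
    name = "tcp"

    def send(self, dest, payload):
        return f"tcp -> rank {dest}: {len(payload)} bytes"


class InfiniBandTransport:
    name = "ib"

    def send(self, dest, payload):
        return f"ib -> rank {dest}: {len(payload)} bytes"


# Registry of known fabrics; adding one does not touch application code.
TRANSPORTS = {t.name: t for t in (TcpTransport, InfiniBandTransport)}


def select_transport(env=None):
    """Pick a transport from the (hypothetical) FABRIC variable, default tcp."""
    env = os.environ if env is None else env
    choice = env.get("FABRIC", "tcp")
    if choice not in TRANSPORTS:
        raise ValueError(f"unknown fabric {choice!r}; known: {sorted(TRANSPORTS)}")
    return TRANSPORTS[choice]()


def application(transport):
    """Application code: written once, unaware of which fabric is in use."""
    return transport.send(1, b"hello")
```

Calling `application(select_transport())` uses whatever fabric the environment names, so switching interconnects is a configuration change rather than a relink.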
While this is good, it has some bad elements. Most of the development that I am aware of in new MPI and related technologies is going on outside the Windows world. So folks on Windows are largely going to miss out until Microsoft gives the green light. This is true of more than just MPI: advanced networking, filesystems, etc.
This places the Microsoft solution firmly behind the power curve. Maybe that’s where they want to be, but I would think a better strategy would be cross-platform unification. This way they could take advantage of developments on Linux.
Finally, a way to win hearts and minds is to make the barriers to adoption as low as possible (this is what Linux did: it made them effectively non-existent). This way, code could easily be ported from an existing Linux port without significant changes. I had previously suggested the Cygwin technology for this. I might suggest again that Microsoft buy that from Red Hat, make it a first-class environment for Windows, and have all our comfortable tools there. Lower those barriers.
Somehow, I don’t think this will happen, but it is worth at least suggesting it. You may view their effort as either a good thing or a bad thing. More choice is good for the market. Vendor lock-in is bad. Ask any customer what they think of being tied to a single vendor. None that I am aware of likes it; many are eschewing it. Which is why Linux is so popular here: no one company controls it. If you don’t like what you see, you can choose another supplier. Microsoft could do good things here, or they can do bad things here.

4 thoughts on “How the Microsoft WCC could be good or bad”

  1. Thanks for the website, nice job.
    I do agree with most of what you say here:
    + Being able to handle all of the world’s interconnects is a real target for everybody. The MPI library known as mpibull2 from Bull does this, and is able to cover a lot of ground when it comes to doing business with both small companies (not able to buy high-performance interconnects) and very demanding, very rich clients (nuclear research…). Switching interconnects at runtime is what we offer, and everybody loves it, since it fills real-life needs (scientists use small clusters and big ones, with different interconnects, and not having to recompile is not a luxury when compiling takes some hours…). Covering all interconnects, from Quadrics to Gigabit, requires a big investment, and is what Microsoft might lack today, unless they are able to convert people to their cause quickly.
    + A lot of scientists also like software such as Matlab. They don’t want to learn MPI stuff, nor learn C/Fortran… If I were a scientist or financial guy and Microsoft offered me a cluster with a click-and-go button to get my Matlab experiments going and my Excel sheets spread all over the cluster, I would sign immediately. What is the financial and time cost of learning to program good MPI + Fortran/C + all the environment on a Unix machine (steep learning curve?) compared to learning Matlab or Excel in a Windows environment? With MS, the users can concentrate on their real scientific/financial/… problems. That’s what MS offers, and I find it great, even if the interconnect is slower (Ethernet, somewhere between 10G and 100G soon to come, but without the good latencies you can get with Quadrics stuff).

  2. Hmmm. I wasn’t aware of the Bull product. Is it related to the Scali product?
    As for learning Matlab, and running it on a cluster, as with everything, it is a cost benefit analysis.
    Matlab:
    Cost: Matlab licenses per node. Low performance (Matlab is interpreted, though there are some compilers for some elements).
    Benefit: speed of development.
    Excel:
    Cost: Excel license per node, and Windows per node. Extremely low performance.
    Benefit: no recoding.
    C/Fortran with OpenMP/MPI:
    Cost: 1 compiler license for commercial compilers, otherwise use free ones. Time required to code.
    Benefit: Very high performance (e.g. sensible utilization of an expensive resource). Very scalable code.
    The performance deltas for Matlab vs C/Fortran can be 10-100 x in favor of compiled code.
    The price delta is one Matlab license per node, and this is not inexpensive.
    The performance delta for Excel will be closer to 100-1000 x in favor of compiled code, and again the price delta will be high. If you can use the Open Source Gnumeric, Open Office, or similar, the price delta can be zero, but at the end of the day, Excel is a very low performance option. If you need to run it faster, recode the parts of it that are slow in C/Fortran, and link them into your sheet. If the whole calculation is slow, don’t use Excel.
    As for the learning curve, you can just as easily use Gnumeric/OpenOffice on a Linux cluster, with identical ease of use (and no learning curve), as you can on a Windows cluster. This argument is a non-sequitur.
    You would need to do C/Fortran + MPI on either cluster. The Linux model has as many IDEs and environments for programming as does Windows. If you use Eclipse, you can use an identical environment across platforms. If you use the Portland Group compilers, you can use identical compilers across platforms. That is, this argument is also a non-sequitur.
    In the Matlab case it is, again, essentially identical: Matlab works the same on both.
    So it all comes back to the cost-benefit of using the Windows cluster, and this is what I find unconvincing. The cost is there, but the benefit is not. If you are really going to run Excel across a cluster, you need to read Doug Eadline’s great article on why HPC is hard. If you are going to do serious computing (and the vast majority of CFD, structural work, etc. is done using compiled code: C, Fortran, etc.), you want the best price performance out of this. In this case, you need to minimize the price among several performance contenders. So even if the Windows cluster were identical in speed to the Linux cluster, it would lose on price performance simply due to the extra you have to pay to build it.
    Some argue that it is lifetime/TCO that we need to measure, and that Linux is not free. Correct, it isn’t free. All the systems have similar long-term costs. Upfront costs are dissimilar. This impacts the lifetime price performance.
    Others argue that it is easier to integrate Windows with Windows. Linux talks to everything, and can authenticate against AD, NT, NIS, LDAP, Kerberos, … so again, this argument is a non-sequitur, though I see it being used to justify this effort.
    My summary is that I failed to find a compelling case for WCC. We will deploy it if our customers request it, and we have asked them whether they want it. None have indicated interest.

  3. Hi Joe,
    To answer your question: Bull products have their own life, and are sold with Bull hardware. We’re not affiliated with Scali.
    I agree with your analysis on all but one point: people might need it to democratize HPC in small businesses, those who can’t afford expensive clusters or runs on huge clusters. “To do serious computing” (using high-performance stuff) is only a subjective point of view. We are in this HPC business and we know what it takes, how a version of the icc compiler can change the whole code optimization, how lock-free stacks impact performance … but:
    + Some people want to use computers as a very abstract tool and do not want to get involved in the cathedral or the bazaar. I know some of them who work on very “serious” biomechanical problems, solving all their problems with Matlab. In the article I read on WCC yesterday, the guy said it took him 15 years to master MPI and everything that goes with it. How can you compare 15 years of investment (do you remember the time you started on Linux?) with a few more hours of a Matlab run on a WCC cluster?
    + That’s the market Microsoft might aim for. I agree with you, they can’t reach the level Linux has for now, in performance, in flexibility, in open source… If I were MS, I would do some marketing like a “Small Business Cluster Kit”, packaging WCC and parallel applications (Parallel Excel, Matlab…), and finding new pricing for licenses. I would convince people to develop software on WCC by giving them money (they have already done that in the past), with one goal in mind: bring parallel applications to everybody.
    My conclusion is that MS might take a share of a market nobody is in: HPC for non-computer-friendly guys, using a GUI and integration everybody knows. Otherwise, they’ll have a hard time catching up.

  4. Interesting (the Bull product).
    I agree that people want to think at a higher level of abstraction. In fact I encourage this. The very important aspect to remember is that nothing is free, as in, you have to pay a price for thinking in this abstract sense. That price is in this case, lower performance.
    Think of it like this. Lots of people are enamored of programming in object oriented manners (OOM). That is great, until you start looking at some of the code they write and are struggling to make go fast. These folks have been taught about design patterns using object factories, serializing and creating complex data structures …
    … all of which are anathema to high performance. The deeper your data structures are, the more dereferences you will need to do to get access to your data, which means that for each access you have to pay memory bandwidth, and if you are not careful, memory latency (and in really bad cases, tlb flushes).
    Moreover, these design patterns themselves encourage a highly non-optimizable coding style, in that you have objects that inherit from each other, and as a result you need to walk symbol tables and hierarchies to figure out which function to call.
    The point of this is, that it is easy to think about these codes and their results at higher levels of abstraction. It is just very hard to get good performance out of highly abstract codes. Google for abstraction penalty to see some of what we mean here.
    As for “cathedral and bazaar”, I know of very few people who buy their machines or software to make a philosophical statement. All of them that I am aware of buy and use machines as they need them. There are practical considerations. We are quite often answering performance related questions, and sizing systems based upon anticipated use cases. This doesn’t get harder with more abstract languages, though the clusters get much larger due to the performance loss associated with running an interpreted code (or an OO code for that matter).
    As for using Matlab on serious bio-mechanical problems, sure. I can believe this. I have used Matlab myself in one form or another since about 1988, so I understand it pretty well, and have done some nice calculations with it. When I needed more speed, I recoded those calculations in fortran (and more recently C). Throwing a cluster at a problem better served by rewriting the expensive portion of the code in a language more suitable to high performance computing, seems to be a waste of serious money. I am sure you and Bull are not encouraging such expenditures.
    As for 15 years of investment in MPI … I was under the impression that MPI is fast coming up on a decade of age, so I am not sure how to respond to this. MPI can be “mastered” in a few weeks of programming. You can do 90+% of what you will ever need to do with 5 function calls. Add in a few reduction calls and another set of things, and you might hit 10 function calls.
    What makes MPI, and actually HPC, hard is that you have to think about it ahead of time, and plan for it. OpenMP doesn’t give you parallelism for free. It cannot. But it is IMO one of the best models for doing so. MPI is hard to learn as it forces you to re-think how you program, and deal with explicit sharing and data motion. None of this is easy, but it doesn’t take a PhD in computer science either.
    This is not by the way, the market Microsoft is aiming for. This is the market that Matlab is aiming for, and that is a good market. I use Octave (free Matlab clone) to prototype complex calcs before coding. It works well, it just isn’t fast. Once I have the calc well designed/understood, the conversion to C (or Fortran) is relatively simple. This usually nets you factors of 10 performance delta for a comparatively low effort. I have not benchmarked Matlab recently, would love to revisit this, but little time for it.
    What Microsoft is aiming to do is to remake clusters in their image. The cluster market has had a 3 year growth period (actually much longer), that is nothing short of amazing. Microsoft wishes to tap into this. I have pointed out in other posts here that it appears that this is mostly a tactic in their war against Linux, and it offers little in the way of real incremental value over what others are doing.
    That is, I can easily build a Matlab cluster today. This won’t impact what runs on the cluster, it simply is a statement of reality. Matlab runs on Linux today. See this link for more details. Nothing about this precludes a user from using it like this, today. Microsoft doesn’t add anything to this.
    As for new pricing, I agree. The WCC competitor has a $0 acquisition cost. Microsoft needs to be aware that when you are looking at a 16 node cluster, cost is very important. The extra $8000 US you get to spend on the Microsoft tools will factor into the decision process. As will the extra $1600 on antivirus and firewalls. What I am saying is that $469 was not a wise pricing model. $46.90 would be much better, but still not perfect.
    As for parallel excel …. I have a simple orbit calculator using a Symplectic integrator I coded up about 3 years ago, in Excel. Takes a really long time to run. Would parallel Excel help? Of course not. And this is what many people don’t get. Tossing a code onto a cluster does not make it parallel. Parallel code is (sadly) hard.
    Microsoft will likely win in those all-windows shops when they appeal to the C-level execs. It won’t be the technology that causes a win for them.
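
To make the orbit-calculator point in the last comment concrete, here is a minimal sketch of a symplectic integrator. It assumes a leapfrog (kick-drift-kick) scheme and a two-body problem in units where GM = 1; the comment does not say which scheme or units the Excel version used, so treat these as illustrative choices. Note the loop: every step needs the position and velocity produced by the step before it, so the time loop itself cannot simply be handed out to the nodes of a cluster.

```python
import math


def accel(x, y):
    """Gravitational acceleration toward the origin, with GM = 1 (assumed units)."""
    r3 = (x * x + y * y) ** 1.5
    return -x / r3, -y / r3


def leapfrog(x, y, vx, vy, dt, steps):
    """Kick-drift-kick leapfrog. Each iteration depends on the previous one."""
    ax, ay = accel(x, y)
    for _ in range(steps):
        # half kick: update velocity with the current acceleration
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        # drift: update position with the half-stepped velocity
        x += dt * vx
        y += dt * vy
        # second half kick with the acceleration at the new position
        ax, ay = accel(x, y)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
    return x, y, vx, vy


def energy(x, y, vx, vy):
    """Total energy per unit mass; a symplectic scheme keeps this nearly constant."""
    return 0.5 * (vx * vx + vy * vy) - 1.0 / math.hypot(x, y)
```

For a circular orbit starting at `(1, 0)` with velocity `(0, 1)`, running `leapfrog(1.0, 0.0, 0.0, 1.0, 2 * math.pi / 1000, 1000)` covers roughly one period. Parallelism would have to come from somewhere else, for example many independent orbits or many bodies per step, and finding that takes rethinking the calculation, not just moving it to a cluster.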
