# Yet another puff piece …

On Windows clusters. They quote Don Becker, one of the cluster illuminati, who made some quite pointed and correct observations. They also quote some marketing types from other organizations who don’t appear to be technical, and don’t grasp what “hard to install” actually means.

Aside from that, one of the least painful aspects of a cluster is “how hard it is to install”. The most painful is the cost of running it, specifically managing users and applications. As the system scales, the per-unit costs add up with it. In the case of Windows, the $469/unit cost means that a moderate 16 node system + head node + file server adds another $8500 to the purchase cost. Not to mention the yearly additional costs of the OS support, the necessary per node anti-virus, the necessary per node anti-spam … That would add in another $1500 or so. So call it an additional $10,000 per 16 node cluster.
For $250k, you can get a pretty good 32 node system, or if you can deal with lower end hardware (dual core, less memory per node, no IB), you might be able to get in 64 nodes. With 64 nodes being about $250k, the Microsoft platform cost is about an additional $40k. Which means you are really spending $290k. Or if your budget is fixed or shrinking, you are going to spend $250k, and get, say, $40k less in nodes. That’s about 10 fewer nodes, to have the privilege of running Windows and all its additional required software. That’s 40 fewer cores. For the same money.
How is this a win?
Oh yes, somehow, you save money. By spending more.
Our customers are pretty adamant about getting more for less as time goes on. We agree, they should get more processing power for less cost per unit of processing power. Unit costs of nodes have remained in the $5k region (+/- $2k). Adding $469 adds roughly 10% additional cost per node. Add in the antivirus and antispam, and I claim it looks closer to $650/node. But I could be wrong on this.
How is adding more cost per node saving money as you scale up the number of nodes?
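For anyone who wants to re-run the arithmetic with their own quotes, here is a minimal sketch in Python. It uses only the figures quoted in this post ($469 per unit, the $650/node guess, 64 lower-end nodes for about $250k, nodes in the $5k region). The function and variable names are made up for illustration, and the "nodes given up" estimate simply divides the software overhead by the average node price, which is the same rough reasoning as above.

```python
# Back-of-the-envelope licensing overhead, using only figures quoted in the post.
# All names here are illustrative; none of this comes from a vendor tool.

OS_LICENSE_PER_UNIT = 469   # $ per unit (the quoted CCS price)
ALL_IN_PER_NODE = 650       # $ per node (the post's guess incl. antivirus/antispam)
TYPICAL_NODE_PRICE = 5_000  # $ per node (the "$5k region" figure)


def software_overhead(units: int, per_unit: float) -> float:
    """Total per-unit software cost for a cluster with `units` licensed machines."""
    return units * per_unit


def nodes_given_up(overhead: float, budget: float, nodes: int) -> float:
    """Nodes the overhead would have bought at the cluster's average node price."""
    node_price = budget / nodes
    return overhead / node_price


# 16 compute nodes + head node + file server = 18 licensed units.
small_cluster = software_overhead(16 + 2, OS_LICENSE_PER_UNIT)

# 64 lower-end nodes for about $250k, at the all-in per-node estimate.
large_cluster = software_overhead(64, ALL_IN_PER_NODE)
lost_nodes = nodes_given_up(large_cluster, budget=250_000, nodes=64)

# License cost as a fraction of a typical node price.
license_share = OS_LICENSE_PER_UNIT / TYPICAL_NODE_PRICE

print(f"16-node cluster, OS licenses only:  ${small_cluster:,.0f}")
print(f"64-node cluster, all-in software:   ${large_cluster:,.0f}")
print(f"Nodes that money would have bought: {lost_nodes:.1f}")
print(f"License as share of a $5k node:     {license_share:.1%}")
```

With these inputs the sketch comes out at about $8.4k for the small cluster, about $41.6k for the 64 node system, and between 10 and 11 nodes given up, in line with the rounded numbers above. Swap in your own node price and per-node software costs to see how the picture changes.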
If W2k3 CCS were free (i.e. zero acquisition cost), it would be a reasonable direct competitor with Linux. Which is zero acquisition cost. On the support cost side, setting up and managing Linux clusters is fairly easy. In many cases, it can be done with no hassle in about an hour from a bare-metal system, by a relative novice admin, on systems of up to several hundred nodes. I am sure that CCS may be doable in a day or less with some expert help, on a small 16-way system.
Moreover, the tasks involved in managing jobs are IMO much much harder on CCS. Take for example, the process to list jobs on one particular node. Let’s compare that with a Linux cluster, shall we?

[landman@xxxxxxxx:~] 7 >qstat -f -t -q all.q@dualcore
queuename qtype used/tot. load_avg arch states
----------------------------------------------------------------------------
all.q@dualcore BIP 3/4 0.07 lx26-amd64
101 0.55500 sleep landman r 07/18/2007 21:34:17 MASTER
103 0.55500 sleep landman r 07/18/2007 21:34:30 MASTER
104 0.55500 sleep landman r 07/18/2007 21:34:32 MASTER

Let me ask, which of these two is easier to deal with? Which one would be more accessible to an end user?
Yeah, one example, but there are multitudes more behind it. I suspect the marketeers commenting on the “ease of installation” and the “no differences” really haven’t done much in terms of managing/building/running their own clusters.
As Don points out, Microsoft has nearly infinite resources to pursue something, no matter how bad a business fit it is for them. Part of those resources appear to be marketing money, “helping” resellers push their product. Just remember that next time one comes a-calling.
Meanwhile we are left with the indelible impression of a small market segment (CCS) that is not growing as fast as the cluster market as a whole (which means it may be shrinking in relative terms). This doesn’t give us a “warm and fuzzy” feeling on porting/supporting W2k3 CCS for products. Sort of like Solaris, which, as data strongly suggests, is on the decline.

### 6 thoughts on “Yet another puff piece …”

1. I was about to write my own response to the article, but I figured you would post a much better one :). Completely agree with pretty much everything you say. I do think CCS has a place, but not for serious scientific computing. Microsoft will have to work very hard to get applications properly ported to CCS for any long term success. Could they do it? Yes I suppose they could. Hopefully they will stick to their original goals and aim for small successes.

2. I don’t know if it is better …
I agree, CCS has a place, I just don’t believe it is where they think it is, as I remain un-convinced of the business case. OTOH, I see a *huge* opportunity to provide a nice windows frontend with interop story to existing Linux clusters. I see a great desktop supercomputing story. We have an 8 core desktop workstation on loan to a customer, running a demo Win xp 64 build. CCS on something like this is IMO a no brainer (assuming CCS looks lots like xp-64). It is a desktop that you can push hard, and run smaller jobs on. We have a few like this in the field, and our customers are ecstatic about them.
The thing that bugs me is the replacement philosophy, specifically in the claim that it is better. Not cheaper or faster, though there are implications (well fisked by now) that it is not cheaper. Faster is open for debate, my measurements are not encouraging right now. I may need to get the PGI for Windows to see if it helps.
The idea at the end of the day is that I am interested in targeting some of the accelerated apps, but I am concerned that I might wind up with 1 or 2 users. The issues they have to fight against for people to adopt it versus the competition (Linux) are the cost, and then their other platform’s momentum. Why would anyone want to port to the lower volume CCS versus the desktop? Which means that XP code (32 bit and happy with it) will exist for a long time. The cost of adding platforms is high, while the benefit (due to installed base size) is low. Careful stewardship of this, making effectively one platform out of many, could help, but momentum is hard to break, and CCS is likely a tiny part of MSFT.
I have lots of ideas on how they could do a better job, sell more, and win business, but they seem to prefer to do their own thing. Don’t get me wrong, I like the people I have met in Microsoft. I just disagree with their marketing and approach, and think that there are better ways to grow their market in HPC. Infinite dollars or not, some businesses should not be pursued in the manner they are doing.

3. My concern with Windows CCS is that if it does manage to take over the way Windows desktop and server editions have, research on HPC technologies may suffer. HPC researchers have been working for years to improve OpenMP and MPI, as well as develop new languages, e.g. Unified Parallel C, High Performance Fortran, with corresponding run time systems and tools. In a few years, will there be a ubiquitous MS MPI implementation over Windows CCS which performs at a mediocre level, is closed-source (limiting potential improvement to MS developers’ efforts only), but is used by everyone? Will HPC research go the way of OS research, where the impact of new techniques is felt only by a small minority of users?

4. Joe – there are IT people, scientists, researchers, and engineers out there who are more comfortable with Windows compared to Linux. In fact, it’s a large market (I’m not telling you anything new). That’s why in less than one year in the market customers have bought thousands of nodes of Windows CCS [MSFT doesn’t allow me to be more specific, sorry]. In fact, one center is being built out right now. Read here: http://www.omaha.com/index.php?u_page=2798&u_sid=10081102
You are clearly not one of these people, and neither are most readers of your blog. We know the market share of Linux OS within HPC clusters, and therefore we know the pre-disposition of these buyers. Admittedly, we’ve learned much more in the past 11 months once we had product in the market.
I will admit that not all Windows CCS “wins” are replacement of Linux clusters. That is a misconception. Most of the sales tend to be for new clusters (like Nebraska) where the Windows cluster runs along side existing Linux clusters (connected or disconnected). There are instances of replacement, but not the majority.
A new model we’re seeing more of is dual-boot, partitioned clusters, where customers can choose to access either Windows or Linux (typically based on the app, or end user skills). This model presents an interesting revenue opportunity for service providers. This model is still developing and would make for a good conversation at SC07 in Reno.

5. Hi Joe,
Cost of Windows licenses may not be an issue at all. Why? All big orgs are covered by Enterprise agreements, campus wide licenses (including the unethical scheme of paying Microsoft for OSX and Linux machines).
What worries me is the unethical practices MS is adopting: “funding” HPC customers to switch away from Linux. So I would not be surprised to see some big announcements…
The only thing we (the HPC community) can do to stop MS entry is to substantially enhance Linux offerings and take the fight to desktops and servers.

6. [quote]
Moreover, the tasks involved in managing jobs is IMO much much harder on CCS. Take for example, the process to list jobs on one particular node. Lets compare that with a linux cluster, shall we?
[landman@xxxxxxxx:~] 7 >qstat -f -t -q all.q@dualcore

Let me ask, which of these two is easier to deal with? Which one would be more accessible to an end user?
[/quote]
With all respect, if you have installed the CCP toolpack, getting the jobs on a particular node is
PS c:\> get-node | ForEach-Object { get-job -input $_ } ### I suppose this command line is self expressing
ID  SubmitBy Name    Status  Priority
--  -------- ----    ------  --------
237 someuser pingjob Running Normal
237 someuser pingjob Running Normal
You can also do,
PS C:\> gn | %{gj -in $_} ### this is the same but not easy to understand for first timer
Alternatively,
PS C:\> mount s CCPProvider somescheduler
PS C:\> cd s:\nodes\somescheduler\237
PS s:\nodes\somescheduler\237> ls
ID Status  Name    CommandLine
-- ------  ----    -----------
2  Running My Task ping -t 127.0.0.1
1  Running My Task ping -t 127.0.0.1