The tipping point for APUs

By joe

October 17, 2009 - 7 minutes read - 1453 words

This news item on InsideHPC made me smile. In short, the HPC application vendors do see the value in decreasing the cost of hardware for their HPC users. It keeps more money available for end users to purchase licenses, even in the face of declining budgets. There are other problems, such as the software license cost now being substantially higher than the cost of the hardware to run the HPC codes on, but that is another problem. So if you have a 128 core cluster that is about as fast on your code as a muscular desktop with 3-4 GPU cards, which is more expensive to procure and run over time? I am not talking about “leadership class” HPC, where you have 10000 cores available to your jobs. I am talking about the emerging everyday HPC. This is the computationally intensive analyses being done on the aforementioned muscular desktops and smaller deskside clusters.

This doesn’t mean that leadership class systems are going away. On the contrary, as we have been predicting for quite some time, as our accelerator business plan had hypothesized (for next year and the year after at that!), the accelerator processing units (APUs) appear to be taking off, with a vengeance. What strikes me in this is that once the ISVs determine that adding a platform makes sense (for all the investment and long term support costs this engenders), that they themselves are viewing this as a way to grow their market. Basically, for a port to occur, they have to project real increases in profit after subtracting off all the costs. This is also why, if a non-volume vendor stops paying the support/porting costs, the ISV tends to quickly drop their platform. We figured that (unlike the vast majority of social media type businesses out there that VCs have been largely blowing their LP’s money on), that there was a real hockey stick growth and opportunity for accelerators in HPC. Its gratifying to see it evolve pretty much exactly as we had thought, and as we wrote in our power points several years ago. Imagine this. You take a real $10B USD/year market, and say, I dunno, 5% goes APU. This is $500M USD. Despite the dollars recent declines, this is still a real market. What APUs do is decrease the necessary size of expenditure for end users. Why by a cluster if you don’t really need it? It will lower your cost and complexity to run a single machine than it would to run a cluster. This is curiously, a similar message to what ScaleMP offers for small/mid-sized units. They will aggregate smaller units into a simpler to manage larger unit. This lowers your management complexity. This reduction in management complexity is driving other business ventures as well. Some are, well, not likely to survive long. The market for cluster management systems is between Rocks, Perceus, and other things that are free. APUs offer a reduction in scaled hardware management complexity. APUs offer a reduction in power consumed per unit time. APUs offer a reduction in floor space. APUs offer a reduction in cost per result obtained. Put another way, it is a no-brainer that they would eventually take off. And ISVs have noticed. The first we heard about serious ISV interest was 3 years ago in some discussions we had with a few ISVs on this. APUs offer the ability to expand their installed base. Not everyone needs a large cluster to run their code quickly. This lets the ISV increase their installed base by serving more customers, as the barrier to start using the expensive code has changed from an expensive machine to an inexpensive machine. The smart ISVs are going to combine this with a pay per use model (aka “micropayments"). This opens up the wider adoption of their software, which increases their installed base. Some will argue that tokens provide this. I disagree. I think most users do as well. Ask your typical HPC user about their proprietary software, and specifically, ask them if the licensing has ever given them grief. Odds are that in most cases you will hear a few horror stories, and learn of the design decisions that users have made to minimize the pain that the licensing schemas in use generally cause. This simultaneously allows their end users to buy time on demand from providers like Tsunamic Technologies, who will host the applications, and possibly help process usage statements, charging for application usage. That is, this is, IMO not simply another platform for the ISVs. This is part of a long term strategy on their part to increase their market size. The leading ISVs are going to be driving their applications there. The open source apps are already going. We have heard from a few folks starting to work on a number of apps. Basically we have reached the point where ISVs and OSS app providers have seen the value of moving to an APU based platform going forward. This is a strategic move. Not one likely to be reversed. In the APU race, GPUs currently have a long lead, and in the GPU space, NVidia has effectively won. I am not hearing of much in the way of GPU ports to ATI … I am sure there are some, but NVidia is pretty much dominating this space with a usable SDK available for free, partners delivering systems (we do this with our tightly coupled storage and processing units), and a growing list of ported applications and wins. In the APU space there are also Cell units, and FPGAs. I know some of my fellow bloggers are firmly embedded in the FPGA space, as are some of our partners. I don’t expect this space to ever take off. I thought it might at some point if the tools became affordable. This has never happened. It doesn’t look like it ever will. Cell, while a great technology, one we resell on some of our desktops, does not look like it is a growth platform. The accelerator costs are 3x the GPUs, and are somewhat harder to program than GPUs. The upside is that it is a completely separate machine. The downside is that, unlike GPUs, the Cell as an APU has been 2 or more orders of magnitude less in terms of shipped/deployed systems. If you asked me two years ago, I would have guessed Cell and GPU fighting it out for a lead, with similar cost hardware and development environments. Specialized co-processors such as ClearSpeed never really had a chance in the general market. There are a few others. The issue that will kill any accelerator before it starts to get traction is economies of scale, cost to adopt, cost to deploy. Right now, Nehalem cores can do 4 DP FP operations per clock cycle. At 3 GHz, this 12 GFLOP per core in DP. 4 cores per chip puts this at 48 GFLOP. Two chips per server puts this at 96 GFLOP. Normal application code will use 1/10 to 1/4 of this capability, unless your code spends all of its time in hand tuned assembly language routines that make effective use of the resources. For an accelerator to be meaningful, it has to deliver ~10x the performance of the full platform (not just a single core), without correspondingly increasing its price by 10x. You will get 10x for free by waiting 5.5 years or so, just from the technological trajectory of Moore’s law. Current NVidia product delivers about 5x in single precision versus the underlying substrate platform they are plugged into. But after a little work, most folks get 10x fairly easily on the platforms. With some rearrangement of the code, you may be able to get to 100x or more. GPU-HMMer gets to 112x in some cases. So why would anyone go back? This is not lost on the ISVs. They see that adding computing power just got real cheap. Its not lost on tool vendors like PGI who are hedging their bets with their compiler platforms. Technologically, it is an excellent hedge against underlying APU changes, it decouples the programming of the accelerator from vendor specific tools. This enables them to keep source code compatibility, even if the APU changes. And they tout that. And with some clever compilation and linking, they can even do unified binaries that will run correctly on APU-ful and APU-less systems, without recompilation. APUs are very much in HPCs future. As Clayton Christenson suggested, APUs are going to destroy something, and create something even better in their place. John West has a story today about disruption in the HPC market. APUs are definitely providing this disruption. We have reached the tipping point.