Our friends at Pervasive Software, showing Smith-Waterman results

While I am pleased to see them in the news, and they are showing that their technology works well on a multicore system … I guess I am troubled by something. Maybe it was the run on the SGI Altix unit; you don’t expect many of those to be around. Maybe it was the 30x performance gap over CUDASW++, which suggests (assuming roughly linear scaling) that 30 nodes with 2 Teslas each could match the results obtained, at a fraction of the cost, power, floor space, and cooling budget.
I am glad to see that on a machine very few people have or can afford, they can show great results. Last I checked, they had some IO issues that were problematic … several years ago we could supply all of their IO needs and then some from a single DeltaV box (i.e. their IO rate was slow, well under 500 MB/s), with lots of IO headroom left over.
I’d like to see them move to GPU and accelerator units. Unfortunately, as theirs is a Java-based system, and Java isn’t likely (especially now) to ever see the light of day on a GPU or accelerator, they are fundamentally bound to the host substrate systems. We are seeing a strong design tendency to turn those host systems into the support infrastructure for accelerators, and to do the heavy lifting on the accelerators themselves. When Sandy Bridge ships, I don’t really expect this to change, though it will bring a whole lotta goodness for this sort of work itself. And I don’t expect Java to be on the leading edge of efficiently utilizing the underlying processor resources.
That said, I do wish them more success over time. They have a neat niche, and a cool tool that takes an inherently slow language/platform and turns it into something nice. But it might be starting to deviate from where HPC is now and where it is going.