Woodcrest impressions

OK, it's about 2 weeks into the Woodcrest experiment. I am starting to form opinions about Woodcrest: where it is good, and where it is ho-hum.
First off, Woodcrest appears to give really good artificial benchmark results, at least in some cases.

Artificial benchmarks are those which are not programs that people run every day with everyday workloads. Synthetic benchmarks are more of the “hundred kernels” variety. Both artificial and synthetic benchmarks have their place, just not as meaningful predictors of application performance. Nothing models an application's performance better than the application itself. It is hard to extrapolate STREAM numbers into VASP, NAMD, or LSDyna performance, and with good reason: VASP, NAMD, and LSDyna do a really good job of enabling you to measure their own performance.
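To make that concrete, here is a minimal sketch of what a STREAM-style triad measurement boils down to. It is Python for brevity, not the actual STREAM code, and the names `triad`, `best_time`, and `triad_bandwidth_mb_s` are made up for this illustration. In pure Python the figure mostly reflects interpreter overhead rather than memory bandwidth, which rather underlines the point: the number describes this one loop and nothing else.

```python
import time


def triad(a, b, c, scalar):
    """STREAM-style triad kernel: a[i] = b[i] + scalar * c[i]."""
    for i in range(len(a)):
        a[i] = b[i] + scalar * c[i]


def best_time(fn, repeats=5):
    """Call fn() `repeats` times and return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def triad_bandwidth_mb_s(n=1_000_000, repeats=5):
    """Report a triad 'bandwidth' in MB/s (hypothetical helper, illustration only)."""
    a = [0.0] * n
    b = [1.0] * n
    c = [2.0] * n
    seconds = best_time(lambda: triad(a, b, c, 3.0), repeats)
    # Counting convention assumed here: two reads and one write of an
    # 8-byte double per element, so 24 bytes per loop iteration.
    bytes_moved = 3 * 8 * n
    return bytes_moved / seconds / 1e6


if __name__ == "__main__":
    print(f"triad: {triad_bandwidth_mb_s():.1f} MB/s")
```

Swap the lambda passed to `best_time` for a call into the real application with a real input deck, and you are measuring the thing you actually care about.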
Second off, there appears to be no end of “doom for AMD” articles being pushed. I wonder if the authors of these articles are somewhat willing to bend to the marketing message from Intel.
Third off, Woodcrest, at least the unit we have here, does not appear to be uniformly “just faster/better”. In my testing thus far (not paid for by anyone, though we have invited some OEMs to fund a formal study), I see that in some very specific, real-world test cases we get anomalous results.
The results you get are very sensitive to code path and to processor resources. If you hit precisely the right mix of processing and memory ops, with little contention for other resources, it appears, from the unit we have here, that Woodcrest can be quite good, even excellent. Sadly, some of the codes that should have hit this right processing mix, such as our refactored Scalable HMMer, do poorly on this processor while doing well on Opteron. We are still trying to understand why this is. I would hate to think we have to refactor the code specifically for Woodcrest.
My expectation, given all the hype, was a better Opteron than Opteron. I am just not seeing this in the general case right now, but this isn’t an exhaustive or comprehensive study. Anyone who wants to fund such a thing is welcome to. We would enjoy spending serious time working on this.

3 thoughts on “Woodcrest impressions”

  1. Just out of interest, what compiler have you used for your tests? In one of your posts you mention the Portland compiler, but is it already optimized for Woodcrest? What about the Intel Compiler (if a compiler is optimized for Woodcrest, it should be this one), have you tried it as well?

  2. So far I have used the Intel compiler (C) and PGI 6.1-6 (and -5, and -1). I would like to use PathScale as well.
    The PGI compilers do support the Intel chips fairly well. PathScale has some of the best OpenMP support I have seen, followed closely by the Intel compiler. I had a little trouble with PGI and OpenMP. It looks like there is a memory placement issue going on, though I haven’t had time to track it down.
    The OpenMP support is needed for the parallel STREAM runs, of course.
    Any other ones worth testing? I have played with gcc-3.4.x, -3.3.x, -4.1.x and a few others. Nothing much positive to report on performance there.
    ACML is BTW pretty good on Woodcrest.

  3. Hello,
    another question about compilers: can you check the Sun Studio compiler?
    And an idea: Woodcrest has much better SSE support than the Opterons (it can load 2-4 times the data in one sweep). That could explain your mixed results.
    Heiko Wengler
