4 for 4, and it’s not good

Ubuntu 10.04. 4 separate machines. All having some sort of nVidia card for CUDA/GPU work. All started from base desktop load.
All, every single one of them, unable to update to CUDA-enabled drivers. Or even to the Canonical-hosted non-CUDA drivers.
Get a black screen. On all 4 boxen. With vanilla loads. With very different motherboards.
I appear to be in good company. Few people can get the NVidia drivers working. There are some links, and yes, I have now tried most of what was indicated.
Our choices are to:

  1. use nouveau, give up on accelerated graphics and GPU programming
  2. use a different distribution that doesn’t completely break this

We are opting for #2. My 9.04 on my laptop will go out of support in a few months (as it will on my various desktops). That’s when we will switch. We will evaluate until then.
I should note that Fedora 12 seems to have the same anti-NVidia disease.
I should also note that, given the direction of computing in general, moving rapidly to accelerator-based systems … yeah … one could say that Canonical allowed ideology to get in front of practicality. I expect this from Fedora, as they have made some rather bad decisions in the past and played with them as a beta for RHEL. I don’t expect this from a distribution that purports to provide long-term support, yet starts, first and foremost, by disabling functionality you have purchased and expect to be able to use.
Current contenders for our consideration include CentOS + upgrades to Gnome and other bits, Debian, and a few others.
I should also note that we are displeased with some of their other decisions on the distro side, which make adding value atop their system far harder than it needs to be. We are redoing Delta-V’s OS load as a result. It will be no different in base OS load than JackRabbit.

5 thoughts on “4 for 4, and it’s not good”

  1. So you’re against customers requesting lock-in solutions in your storage business area but perfectly fine for their requesting lock-in in computation? CUDA is a single-vendor solution. Yes, it’s where much of the code is, but that doesn’t change that code written for CUDA relies on a single vendor.
    It’s fine (imho, and important) for a business like yours to compromise and cater to customers making these requests, but why do you expect others like Ubuntu to make the same choice? For a long-term support release, they don’t want to tie themselves to nVidia.
    Note: I detest Canonical for other reasons. And the free driver doesn’t handle OpenCL either. I’m just focusing on your desire for CUDA as opposed to Canonical’s desire to mitigate the single-vendor risk of relying on nVidia supporting *all* their currently supported cards across their LTS lifespan.

    • @Jason
      I can’t say I understand the leap you made w.r.t. the lock-in. It is my choice to run CUDA or OpenCL based systems. I think it should remain my choice as to whether or not I can use them. Currently CUDA is the market leader, and by the looks of things, this won’t change for a while. This is what a competitive market will bring to bear. Whether or not CUDA is the most appropriate platform is to a degree less relevant than whether it establishes ubiquity. The latter, it is well on its way towards.
      Do I expect Ubuntu to tie its releases long term to another vendor? No. Do I want them to give me the choice to do this myself, without throwing up nearly insurmountable roadblocks? Yes.
      All I want is a nice little “Install Nouveau, or leave blank to install other graphics drivers”. Or something like that.
      Unfortunately, this directly flies in the face of what they are doing relative to a consistent look and feel.
      I don’t despise Canonical. I think they made some unwise choices.
      I can say that 1 of 5 worked (after I posted the 4 of 4 failing). The one that worked, did so on a special motherboard. Which leads me to believe that they have not just problems on the module side, but problems on the kernel side. That unit wasn’t stable with the vesa, nouveau, or NVidia drivers.
      As I noted, I am not pleased by this. Given my current laptop is a Dell NVidia quadro FX mobile workstation, and my next one will also have NVidia graphics, this is very … very important to me.
      On our storage, we will tell people when we think they are making a mistake. But it is not for us to disable them from making the mistake.

  2. It also isn’t for you to support every possible user choice. That’s Canonical’s view about the nVidia proprietary drivers. They’re not spending effort testing those drivers because they don’t support that choice. I highly doubt that they’re intentionally blocking them. They’re just not testing against those drivers. You want them to keep an unsupported option in their design and testing.
    I can see why they wouldn’t want to tie their testing to the nVidia drivers. nVidia has a history of making unannounced changes.
    The other quality issues… well…
    (My note about despising Canonical is about *my* views; I didn’t mean to imply that you might have that view.)

    • @Jason
      Hmmm …. I see this argument wandering a ways from its original base. My contention was that Ubuntu decided not to offer the capability to do something that a fairly large number of people want to do. Moreover, their decision appears to be based less on market realities (e.g. popularity of NVidia hardware, massive growth/use of CUDA, etc.), and more on ideology (hey, let’s test this new driver now so we can abolish the evil closed source driver bits).
      I’d be happy to be proven wrong, and see that it was simply a testing issue on their part as you now suggest.
      Given how popular NVidia hardware is, and how ubiquitous it is, the likelihood that this wasn’t in their test matrix? Well … this isn’t a great argument.
      I fault them for making choices which make it harder to use the platform they want us to use (which apart from a few other jaw-droppers like grub2, is generally OK). I fault them for not considering the impact of their decisions.
      Do I want them to keep well-used hardware in their test matrix? Generally, yes. Do I want them to make sure their new-fangled bits don’t break hardware? Generally, yes. Does this mean paying attention to drivers that might not be open source, but happen to be popular? Quite likely.
      Is this beyond the pale or beyond the scope of their actions? Not judging by what they ship now: they do in fact ship licensed codecs as binary add-ins, which are not open source.
      Does nouveau make an end user’s NVidia experience better or worse than the closed drivers? I haven’t seen a case of better yet, I’ve only heard of problems, of apps not working (usually things like compiz, Gnome-do, etc.), crashes, …
      Do the NVidia drivers result in a better or worse experience than the open drivers? Generally, we see they are better. We don’t expect this to change for a while (time scale measured in years).
      They had to make a choice. They currently include non-free drivers and codecs. They seemed to ignore the popular one though.
      So I don’t buy the testing matrix argument.
      The higher-risk path, for the end user, is to go with the default choice they made for NVidia drivers, without any realistic hope of getting operational drivers onto the system. It damages the user experience, it damages the functionality available. How, precisely, is this an improvement, apart from ideological purity? Sure, you can claim moral high ground with this position. Even get bonus points for sticking it to the “evil” closed source driver maker.
      FWIW, we haven’t seen much problem in the NVidia drivers over the past several years on many different platforms. I am unhappy that NVidia dropped build support for Itanic, as we have an Itanium 2 in our lab which has been powered off for about a year now, with a nice NVidia card in it, and no way to run it with a modern Linux distro. That’s about my only real complaint on their closed source driver.
