CUDA not ready for prime time
How can anyone justify CUDA for corporate development?

While the potential of NVIDIA GPUs is impressive, I have some major concerns about investing my and my colleagues’ time and effort into rolling out CUDA in our firm.

1. The CUDA development software (NVCC) is not supported by NVIDIA. I would like to hear someone argue otherwise. The forum is loaded with posts from people who have experienced problems with NVCC; very few of them are answered, let alone addressed. There is no channel by which a user can pay NVIDIA for extended, direct one-on-one support, whether by phone or email.

2. The CUDA software is not open source. If NVIDIA does not support the software, what about the user community? NVIDIA has not released the source code for the NVCC package. The only source code that has been made available is NVIDIA’s modifications to Open64 (nvopencc). There is no current or correct documentation for this package, and as far as I can determine, it cannot be built on Windows in its current form (using Cygwin, MinGW, or whatever you propose). Again, if someone can argue otherwise, I would love to hear it.

3. There are almost no CUDA debugging tools. Yes, I know that Nsight is in beta (with no release date announced), but there are no debuggers for Linux, and the emulation libraries are deprecated and slated to be removed. Also, Nsight is not supported on Windows XP or on Visual Studio 2010.

4. There are almost no CUDA optimization tools. The optimization strategy for CUDA, and GPUs in general, seems to be “try it; change it; rerun it until it’s fast” (see the timing sketch after this list). The profiler is not useful for trying to understand what is happening inside the processor.
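For what it’s worth, that tuning loop usually ends up hand-timed with CUDA events rather than driven by the profiler. A minimal sketch of the loop, using a placeholder kernel and launch configuration (nothing here is anyone’s real code):

```cpp
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder kernel; stands in for whatever you are actually tuning.
__global__ void scaleKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= 2.0f;
}

int main()
{
    const int n = 1 << 20;
    float *d_data = 0;
    cudaMalloc((void **)&d_data, n * sizeof(float));
    cudaMemset(d_data, 0, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Time one launch. In practice you vary the block size, memory layout,
    // etc., and rerun until the number stops shrinking.
    cudaEventRecord(start, 0);
    scaleKernel<<<(n + 255) / 256, 256>>>(d_data, n);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("kernel time: %.3f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_data);
    return 0;
}
```

(Compiles with plain nvcc; error checking omitted for brevity.)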

Contrast this situation with Intel, which not only has well-supported software products (the Intel C++ Compiler, Threading Building Blocks) but also provides excellent software support through its website. I understand that Intel C++ is a purchased product (as opposed to the free NVIDIA CUDA tools), but I would certainly be willing to pay NVIDIA for their software if that made it usable in a corporate environment.

The alternative for GPUs is of course OpenCL, which has a smaller user community, no debugger, and apparent performance issues.

While the hardware cost in dollars per GFLOP is much lower for GPUs than for multicore CPUs, it is not clear that this saving will offset the many weeks of trial and error needed to get our software running.

I would like to hear other opinions on this.

  1. The official support mechanism for the software stack is not the forums; instead it’s via the registered developer site. Bug reports are usually answered fairly quickly there. I post here because of the broader audience, but my day job is Tesla driver development; I’m not a support guy or anything like that.
  2. I’m not sure why open source is necessarily a mark of quality.
  3. On Linux, cuda-gdb has existed for what, a year and a half now? Since the 2.1 beta, if I remember correctly. It works extremely well, especially on GF100.
  4. Feel free to file RFEs on the registered developer site if you have suggestions on how to improve the profiler.

Not going to argue at all with this. Seems true.

I guess what you describe is the price for any cutting-edge tech.

Any breakthrough gets us (the developers) back to the stone age of (near) assembly language and ugly hacks all over the place to get the stuff to work at all.

Depends on what you do and how much it would benefit from CUDA. But no risk, no gain.

Advice? Buy a GTX 480, implement a few benchmarks specific to what you’re doing (hire a CUDA dev for a month or so?), and see how fast they run.

Edit: quote added

The form there has a required “Company” field, and the whole thing looks like it’s for corporations only. Does that mean that if I’m just “some guy” I have no formal way to submit an otherwise valid bug?

Well, the “registered development program” is apparently by invitation only; after you apply, you are notified “in a few weeks” if you have been accepted. This falls short of the type of support that we would expect from a software vendor.

I’m not at all claiming it’s a mark of quality. I’m simply pointing out that, unlike the situation with Linux or SVN, there is no way for sophisticated users or experts to take on the support role themselves, if that was NVIDIA’s intent.

Agreed, this was a misstatement on my part. One alternative suggested by one of my developers was to run Linux under VMware and do our development there, but VMware currently does not support drivers that let you talk to the GPU from within a virtual machine.

I’ll be happy to do that once we (a) get the compiler working, and (b) get “accepted” as a registered developer!

Yeah, sounds like the cheapest thing for you to do is hire a CUDA consultant who understands the current state of the technology to analyze your problem and help you decide if CUDA is worth investing in. If you aren’t sure if CUDA is a good fit, and your time is worth that much, it seems like the only sensible solution. Let the consultant deal with NVIDIA.

The registered developer application should really take less than a few weeks now (less than a day, really); let me ask around and see what is going on there.

Not sure why a consultant wouldn’t face the same problems (unless they worked for NVIDIA). Also, at some point we are going to have to bring this technology in-house, not to mention issues with trade-secret code, etc.

If you haven’t applied for the registered developer program yet, you should; not sure why that disclaimer is there, but it really shouldn’t take that long to get things done.

OK, fair enough. The consultant came to mind because the vast majority of the support that developers appear to need in the forums relates to inexperience with CUDA rather than to problems that require intervention from NVIDIA. CUDA is dissimilar enough from typical multithreaded/multicore programming (despite using similar terms) that the biggest obstacles are: (1) understanding the programming model and its range of applicability, (2) understanding the details of how to write an efficient CUDA program, and (3) figuring out how to map your problem to CUDA, if possible. An experienced consultant can get you over those hurdles quickly and needs no input from NVIDIA.
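As an aside, point (1) is easy to underestimate. A minimal sketch of the model, with made-up names purely for illustration: instead of looping over elements on the host, you write the loop body as a kernel and launch one thread per element.

```cpp
#include <cuda_runtime.h>

// One GPU thread per element replaces the inner loop of the CPU version.
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)                        // guard: the grid may be larger than n
        y[i] = a * x[i] + y[i];
}

// Host side: choose a block size, round the grid up, launch, then wait.
void run_saxpy(int n, float a, const float *d_x, float *d_y)
{
    int block = 256;
    int grid  = (n + block - 1) / block;
    saxpy<<<grid, block>>>(n, a, d_x, d_y);
    cudaThreadSynchronize();          // CUDA 3.x-era sync, so errors surface here
}
```

None of that is hard in itself; the learning curve this thread keeps circling back to lives in points (2) and (3), which is exactly where a consultant or teacher earns their keep.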

Every once in a while someone stumbles over a serious toolkit bug, and that certainly needs a direct response from NVIDIA. Toolkit bugs seem to be addressed quickly, but the typical response after the bug is fixed seems to be “wait for the next release” (possibly a beta). While that is usually fast enough turnaround for research (we don’t have the budget to expect anything faster), it certainly is not as fast as a nice support contract.

You shouldn’t assume that “wait for the next release” is the answer we give to Tesla customers, either.

Oh, that’s good to hear! I’ve been in the GeForce ghetto too long, clearly. :)

Basically, Tesla = support. If you buy Teslas and you find bugs, you’ll get the fix a lot faster than an academic on a GeForce reporting bugs via the forum.

We bought two Tesla C1060s with the intent of doing exploratory development, to see whether we should be investing in the Tesla server products. If there is a mechanism for CUDA support other than the “registered developer program” (which we are still waiting for), I am not aware of it.

You need profit. Makes sense, but for a consumer it also sort of invalidates the approach of “tasting” the technology before risking too much investment in it. All or nothing, it seems.

At least I have a clear picture now, thank you!

Eh? You can certainly try things out, develop software, etc.; there’s just a different level of support involved.

I meant I know now why I can’t expect support for my bugs, that’s all :)

I’d be the “some guy” at the bottom of the ladder. I have two electrical engineering degrees and 25 years in technology, so it’s in my blood. I’m working on a hobby project, re-honing technical skills dulled in the management rat race. It’s an aggressive project suited for high-performance computing and sleepless nights. I figure I’m in the same boat as a junior engineer doing HPC development at a company. I probably put in 50 hrs/week of my spare time studying everything I can find, and I’m still drinking from the proverbial fire hose. I’d be surprised if most junior engineers could be “productive” in less than a year.

The theory of consulting is shaky. When the consultant leaves, you need a staff of proficient doers. I don’t see the mechanisms to efficiently teach this stuff. Criticize me if you want, but this is a hodgepodge of highly technical PowerPoint presentations. That’s my impression. I have time to slog it out, but it’s a hard sell for most companies.

I see a lot of potential in the field, but the learning curve is extremely steep. You have to have a solid grasp of hardware architecture, OS operation, debugging, and assembly/C, to name a few. Show me a stack of those resumes and I’ll have a stack of liars.

I have to agree with the OP. The business case is going to be hard on this one.

OTOH, the OP mentions Intel’s parallel product suite, perhaps implying it is better suited for a commercial developer. I’ve evaluated both technologies for two months now, and my initial (probably way underqualified) impression is that Intel is significantly behind. I really don’t find much more than a line of products, less technical documentation, and a whisper of activity on their various forums. The marketing hype is high, but I sense a lack of substance the more I dig. Maybe I’ve missed something. Relative to Intel, I see the CUDA platform well ahead. My initial impression of the debugging tools was really disappointing; I can’t even remember the name of them. I’ve been much more impressed with Ocelot.

I don’t see how to make a business case for anything but a product with a five-year vision that absolutely relies on computing speeds orders of magnitude faster than they are today. I can see how this technology could become a significant competitive advantage. But the vision had better be clear, because there’s a lot of time and investment on the front end. Whoever is mastering this technology (NVIDIA and others) isn’t going to give it away… no way. This $it is hard.

One more comment about Intel. Let’s get serious: what segment does Intel lead in other than CPUs? That’s a lame argument, IMO. You think that because you put Intel’s name on it, it’s the best choice? Not buying that one. But that’s a personal opinion, freely given and worth as much. I think the better question is: do you see the CPU being the source of scalar apps, or the GPU? I would never have guessed at anything but the CPU two months ago. However, I’m beginning to think Tesla and GPUs may be on to something much more promising than a 10 GHz, 2 nm, 60 GB L1 cache CPU.

Al

This is starting to veer way off topic (and I don’t work in industry), but in the context of CUDA, the most valuable consultant is more of a teacher than a mercenary. I’ve informally “consulted” for several of my colleagues over the past few years. We basically sit at a whiteboard, I explain the basic idea of CUDA, point out the common conceptual pitfalls, listen to them explain the problem they want to solve, and propose a few possible approaches. They are not CUDA experts when we’re done talking, but hopefully I’ve cut a few weeks off their learning curve and preempted a few development dead-ends in the process. I would expect a paid consultant (I was free) to then jump into the trenches with your staff developers and continue to do some hands-on teaching while helping you solve your problem. (As I said, I am quite naive, but if I were writing the checks, that’s what I’d want.)

Anyway, I’ll bow out of this conversation since I have no experience with the support that comes with Tesla cards. :)

Agreed. While a consultant can be useful to bring a technical team up to speed on architecture, techniques, tuning, etc., I’m thinking more about what happens when you install a new critical patch on your Windows 7 box and NVCC stops working. Do your highly skilled developers sit around for a few weeks waiting for an answer?