What are NVIDIA's plans with regard to OpenCL?

On various other forums I’ve seen claims that NVIDIA is trying to achieve as much vendor lock-in as possible by promoting CUDA and completely halting any effort to support newer OpenCL versions. As far as I can tell from the 1.2-back-to-1.1 OpenCL controversy in the GeForce 700 series, that’s pretty much on the dot.

I have had great respect for the NVIDIA team and their drive for innovation ever since my first graphics card - the GeForce FX 5400. Now it seems like the company is more interested in making a buck than in really supporting the GPU-computing dev community, just as AMD has shown better performance/dollar with the OpenCL framework vs. CUDA.

Now, IMHO, CUDA is way easier to use and has more features than OpenCL 1.2, but I can’t use it on my AMD cards. Should I just switch to all AMD cards now, since they DO support higher versions of OpenCL? (oh yeah, and Intel supports 1.2 too, a Microsoft ally just like NVIDIA)

Can someone out there please tell me what NVIDIA’s official position on OpenCL is?
Should I start a petition for them to support OpenCL 1.2, or is it completely useless?

Can you prove this with tests (besides the often-used bitcoin mining comparisons from 2 years ago)?

Please submit your running times for four objective tests on any single AMD GPU:

  1. Sorting an array of 67,108,864 doubles.

  2. Floyd-Warshall on a dense adjacency matrix of 10,000 x 10,000. You can use my implementation (which has full path reconstruction); it should be about the same in OpenCL.

  3. BLAS GEMM on matrices of size 6000x7000 by 7000x6000, 32- or 64-bit floats.

  4. Your choice of some simple brute-force algorithm (permutations/combinations) where the problem space is over 2^31.
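For anyone who wants to sanity-check a GPU result for the Floyd-Warshall test before timing it, here is a minimal CPU reference sketch in plain Python with full path reconstruction. It is not the implementation mentioned above. The names `floyd_warshall` and `reconstruct_path` are my own, and it assumes a dense matrix that uses `math.inf` for missing edges and contains no negative cycles. It is only practical for small matrices, nowhere near 10,000 x 10,000.

```python
import math


def floyd_warshall(weights):
    """All-pairs shortest paths on a dense adjacency matrix.

    weights[i][j] is the edge cost from i to j, or math.inf if there is no
    edge. Assumes no negative cycles. Returns (dist, nxt), where nxt[i][j] is
    the vertex that follows i on a shortest path to j (None if unreachable).
    """
    n = len(weights)
    dist = [row[:] for row in weights]
    nxt = [[j if weights[i][j] != math.inf else None for j in range(n)]
           for i in range(n)]
    for i in range(n):
        dist[i][i] = 0
        nxt[i][i] = i

    for k in range(n):
        dist_k = dist[k]
        for i in range(n):
            d_ik = dist[i][k]
            if d_ik == math.inf:
                continue  # k is unreachable from i, nothing to relax
            dist_i, nxt_i = dist[i], nxt[i]
            for j in range(n):
                candidate = d_ik + dist_k[j]
                if candidate < dist_i[j]:
                    dist_i[j] = candidate
                    nxt_i[j] = nxt_i[k]  # first hop toward j now goes via k
    return dist, nxt


def reconstruct_path(nxt, u, v):
    """Return the list of vertices on a shortest path u -> v, or [] if none."""
    if nxt[u][v] is None:
        return []
    path = [u]
    while u != v:
        u = nxt[u][v]
        path.append(u)
    return path


if __name__ == "__main__":
    inf = math.inf
    w = [[0, 3, inf, 7],
         [8, 0, 2, inf],
         [5, inf, 0, 1],
         [2, inf, inf, 0]]
    dist, nxt = floyd_warshall(w)
    print(dist[0][3], reconstruct_path(nxt, 0, 3))
```

Comparing the `dist` matrix from a GPU run against this on a few hundred vertices is a cheap way to catch indexing mistakes before running the full-size benchmark.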

All companies want to make a profit, as that is their purpose. AMD is just not very good at that aspect. Yes, AMD GPUs tend to be cheaper, but now with the $150 GTX 750 that gap may have narrowed.

Just out of curiosity, do AMD GPUs have similar functionality to NVIDIA’s Hyper-Q feature, dual copy engines, or __ldg()?

I have mad respect for the fact that AMD GPUs have fast integer operations, but that is only a small part of the picture. For games the GTX 780 (a single GPU) still beats the 290X by a large margin:


NOTE: Not interested in a prolonged debate about which GPU is ‘better’, just would like to see the running times on a 290X for the above. So far no AMD user has answered this call (and included source code, as I always do).

More to the point of your question:

I have not seen NVIDIA mention an official position on OpenCL support, but I think their actions are pretty clear: they aren’t very interested. It is of little practical concern whether they are ignoring OpenCL because of a desire to increase vendor lock-in, or because they want to innovate more rapidly in the space of GPU programming models than OpenCL will allow them. It could even be some of both. Either way, the outcome is the same.

I’m inclined to believe a petition is going to be useless on principle. However, if OpenCL is important to meet your application needs and AMD cards are sufficiently fast, you should definitely stop purchasing NVIDIA GPUs and publicly explain the reason. Lost sales do matter to NVIDIA, although only in proportion to the amount of money you decide not to spend on NVIDIA stuff. :)

In practice we’re seeing people use OpenCL for AMD GPUs and CUDA for Nvidia GPUs.

One reason is the subpar performance portability between GPU vendors, which means you need to optimize for each platform anyway. Furthermore, porting between OpenCL and CUDA is quite straightforward for most kernels that don’t use advanced CUDA features (OpenCL appears to trail behind, adding CUDA-like features maybe 2-3 years afterwards).
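To give a feel for how mechanical a port can be for a simple kernel, here is a rough sketch that rewrites a toy OpenCL SAXPY kernel into CUDA by plain text substitution. The table lists well-known one-to-one counterparts for the x dimension only. The helper `port_kernel_source` and the sample kernel are made up for illustration; this is not a real translation tool, it ignores host-side code entirely, and it would mishandle things like `__local` kernel arguments.

```python
# OpenCL C -> CUDA C counterparts (x dimension only). Order matters:
# "__global " is stripped before "__kernel " introduces "__global__ ".
OPENCL_TO_CUDA = [
    ("__global ", ""),                       # CUDA pointers need no qualifier
    ("__kernel ", "__global__ "),            # kernel entry point
    ("__local ", "__shared__ "),             # on-chip scratch memory
    ("get_global_id(0)", "(blockIdx.x * blockDim.x + threadIdx.x)"),
    ("get_local_id(0)", "threadIdx.x"),
    ("get_group_id(0)", "blockIdx.x"),
    ("get_local_size(0)", "blockDim.x"),
    ("barrier(CLK_LOCAL_MEM_FENCE)", "__syncthreads()"),
]


def port_kernel_source(opencl_source):
    """Naively rewrite OpenCL kernel source into CUDA kernel source."""
    cuda_source = opencl_source
    for opencl_token, cuda_token in OPENCL_TO_CUDA:
        cuda_source = cuda_source.replace(opencl_token, cuda_token)
    return cuda_source


# Hypothetical sample kernel, used only to exercise the table above.
SAXPY_OPENCL = """__kernel void saxpy(const int n, const float a,
                    __global const float *x, __global float *y)
{
    int i = get_global_id(0);
    if (i < n)
        y[i] = a * x[i] + y[i];
}
"""

if __name__ == "__main__":
    print(port_kernel_source(SAXPY_OPENCL))
```

The kernel body itself is untouched, which is the point being made: for kernels like this, the real effort goes into per-vendor tuning, not into the port.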


  • Microsoft is pushing C++ AMP.
  • Intel wants you to use Cilk or old-school OpenMP. If you talk to their tech people, they rarely recommend OpenCL (the “it’s too hard” sales argument).
  • Google is pushing RenderScript.
  • Nvidia is pushing CUDA and has forgotten that OpenCL ever existed.

When you break it down, we don’t really have good cross-platform options, and everyone is trying to sell you their own Kool-Aid.

Now, if the cross-platform performance portability and the development support (you can’t debug OpenCL on NV GPUs anymore, AFAIK) are very poor, what is my argument for not using the best tool possible for each respective hardware platform? Hence people end up using OpenCL on AMD GPUs and CUDA for Nvidia GPUs.

The problem is you are trying to get the best of both worlds, when those two worlds are by definition competing. OpenCL is a high enough abstraction to work everywhere; however, performance suffers as a consequence of this abstraction layer and generalization. CUDA is closer to ‘the metal’ and can be optimized, so it will perform better on NVIDIA GPUs.

You can have the extra performance boost of using something specific to the hardware that you are using, or you can have the extra flexibility of using a general solution that works anywhere. Pick one.

If you look at the graphics world, OpenGL and DirectX have been the go-to for rendering, and yet AMD have seen fit to bring out a ‘closer to the metal’ option to provide that extra performance boost with Mantle. It is really no different with OpenCL: you still end up tweaking it for your specific setup.

Thank you all for the answers*,

I appreciate the input. @seibert, I think you’re completely correct that NVidia might be deferring OpenCL updates because they’re more into designing things, like @Tiomat said, “closer to the metal”, and they can only do so with the GPGPU part of CUDA. @Tiomat, I think you’re right about the essential difference between OpenCL and the rest of CUDA, yet I still believe that NVidia should be worried about both, since they don’t want to lag behind the community in cross-platform stuff either.

@seibert, sad but true - a petition might be utterly useless if NVidia doesn’t care about how devs feel about it. But if there’s enough support, at least they’ll be aware of public opinion. In terms of financial gain, perhaps they’ll want to reconsider if they hear that we’re currently building a new CPU/GPU cluster and might just want to go with AMD for now due to better OpenCL support.

@Mr. Pattersson, completely true. There’s always some lag between DX and OpenGL, CUDA & OpenCL. And yes, everyone’s pushing their own thingy alongside the community-accepted standard. But do we want to live in a world where MS produces ALL operating systems? Or in a world where NVIDIA produces ALL graphics cards? I’m sure that’s what Microsoft & NVidia must want, but, unless you’re working there, you’d get the short end of the stick, it seems.

I have high hopes for C++ AMP, btw, ever since the Clang compiler started adding support for it. I hope it becomes the next cross-platform tool instead of OpenCL, because yeah - that doesn’t quite cut it.

* except @CudaaduC - It seems like you didn’t really answer my question, but if you’re interested in fair benchmarks, perhaps you can start one of your own. I can only claim to believe that 4 higher-end AMD cards have much higher raw floating-point throughput than a single higher-end Tesla, while costing roughly the same.
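As a back-of-the-envelope check on the raw-throughput claim, theoretical single-precision peak is usually estimated as shader count x clock x 2 (one fused multiply-add counted as two operations). The sketch below plugs in the R9 290X and Tesla K40 as stand-ins for "higher-end" cards. Those two picks and their spec-sheet numbers are my assumptions, not something stated in this thread, and theoretical peak says nothing about sustained performance or double-precision rates.

```python
def peak_sp_gflops(shader_count, clock_mhz, flops_per_cycle=2):
    """Theoretical single-precision peak in GFLOPS.

    flops_per_cycle=2 counts one fused multiply-add as two operations.
    """
    return shader_count * clock_mhz * flops_per_cycle / 1000.0


# Assumed spec-sheet values (not measurements); verify before relying on them.
R9_290X = {"shader_count": 2816, "clock_mhz": 1000}
TESLA_K40 = {"shader_count": 2880, "clock_mhz": 745}

four_amd = 4 * peak_sp_gflops(**R9_290X)
one_tesla = peak_sp_gflops(**TESLA_K40)

if __name__ == "__main__":
    print(f"4 x R9 290X: {four_amd:.0f} GFLOPS peak")
    print(f"1 x Tesla K40: {one_tesla:.0f} GFLOPS peak")
```

Numbers like these only bound what is possible; actual running times on the same problem are what settle the argument.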

I think it’s pretty clear by looking at FFT/BLAS performance. Nvidia could easily optimize the same libraries in OpenCL, but they don’t.