CUDA / OpenCL 1.1 Comparison

Are there any comparisons on the web? I would like to know the differences in terms of feature sets.

I have tested the same algorithms in both CUDA and OpenCL. When run on a CUDA GPU, the speeds are virtually the same, although the test cases were rather simple. What I find nice about OpenCL (which reminds me a lot of the CUDA driver API) is that you can take the code without ANY modifications and run it on a CPU using the AMD OpenCL SDK. I use AMD OpenCL on a machine with CUDA cards in it, and it happily coexists with CUDA. You can choose at runtime whether to use the CPU or the GPU.

Currently I have an HD 5850, but I would like to add a Fermi card to evaluate CUDA. Are you working with Windows or Linux?

I mostly work with Linux (OpenSuSE), and the tests I mentioned were run under Linux.

However, I have also run some production code (for a microscope manufacturer) under both Linux and Windows. That code was initially written in CUDA and then ported to OpenCL. It was tested only on NVIDIA GPUs, and there the CUDA and OpenCL performance is essentially the same. That is not surprising, as the generated assembly code (PTX) is more or less the same.