NVIDIA OpenCL SDK deployment: so nineties

And I know nineties!

Trying to get my feet on the ground with the NVIDIA SDK. Apart from segmentation faults, compile errors, tons of warnings and bad docs, there’s not much fun yet.

For example:

# less /opt/nVidia/OpenCL/Samples.html

<head>
        <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
        <title>NVIDIA OpenCL SDK Code Samples</title>
...

Let’s have a look and open that file in Firefox (firefox Samples.html):

file:///opt/nVidia/OpenCL/Samples.html

Nice. Let’s click on some “Browse files” links in there:

d:/bld_sdk10.pl/devrel/SDK10/Compute/C/src/inlinePTX BOOM

Maybe the “Last Update: 4/30/2011” explains it.

OK. Let’s try to compile something, after pointing gcc at gcc-5 (the default gcc here is 6.2):

# /opt/nVidia/OpenCL> make

...
cc1plus: warning: command line option ‘-Wimplicit’ is valid for C/ObjC but not for C++
In file included from ../../..//OpenCL//common//inc/CL/opencl.h:44:0,
                 from ../../..//OpenCL//common//inc/oclUtils.h:26,
                 from oclTridiagonal.cpp:42:
../../..//OpenCL//common//inc/CL/cl_gl_ext.h:38:4: warning: "/*" within comment [-Wcomment]
  * /* cl_VEN_extname extension  */
    ^
In file included from oclTridiagonal.cpp:45:0:
test_gen_result_check.h: In function ‘void test_gen_cyclic(float*, float*, float*, float*, float*, int, int)’:
test_gen_result_check.h:105:51: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
         file_read_array(a, system_size, "a256.txt");
...(like 50 times)...
sweep_small_systems.h:40:102: warning: unused parameter ‘kernelTime’ [-Wunused-parameter]
 double runReorderKernel(int devCount, cl_mem *dev_a, cl_mem *dev_t, int *width, int *height, double *kernelTime)
                                                                                                      ^
In file included from oclTridiagonal.cpp:61:0:
sweep_small_systems.h:96:120: warning: unused parameter ‘dev_t’ [-Wunused-parameter]
 double runSweepKernel(int devCount, cl_mem *dev_a, cl_mem *dev_b, cl_mem *dev_c, cl_mem *dev_d, cl_mem *dev_x, cl_mem *dev_t, cl_mem *dev_w, int system_size, int *workSi
                                                                                                                        ^
sweep_small_systems.h:96:146: warning: unused parameter ‘system_size’ [-Wunused-parameter]
 double runSweepKernel(int devCount, cl_mem *dev_a, cl_mem *dev_b, cl_mem *dev_c, cl_mem *dev_d, cl_mem *dev_x, cl_mem *dev_t, cl_mem *dev_w, int system_size, int *workSi
...
make[1]: Entering directory '/opt/nVidia/OpenCL/src/oclHiddenMarkovModel'
cc1plus: warning: command line option ‘-Wimplicit’ is valid for C/ObjC but not for C++
In file included from ../../..//OpenCL//common//inc/CL/opencl.h:44:0,
                 from ../../..//OpenCL//common//inc/oclUtils.h:26,
                 from oclHiddenMarkovModel.cpp:12:
../../..//OpenCL//common//inc/CL/cl_gl_ext.h:38:4: warning: "/*" within comment [-Wcomment]
  * /* cl_VEN_extname extension  */
    ^
oclHiddenMarkovModel.cpp: In function ‘int main(int, const char**)’:
oclHiddenMarkovModel.cpp:140:12: warning: variable ‘szWorkGroup’ set but not used [-Wunused-but-set-variable]
     size_t szWorkGroup;
            ^
... another 1000 lines spared

Ok, after lots of vomiting it runs through. Not sure if I should attempt to execute the resulting binaries but let’s try…

Ah, wow: 20% of the binaries in /opt/nVidia/OpenCL/bin/linux/release actually run without error.

For the rest it is:

/opt/nVidia/OpenCL/bin/linux/release/oclMedianFilter Starting (Using MedianFilter.cl)...
...
 !!! Error # -11 at file oclMedianFilter.cpp, line 298

-----------------------------------------------------------

Build Log:
<kernel>:51:72: error: call to 'mul24' is ambiguous
            uc4LocalData[iLocalPixOffset] = uc4Source[iDevGMEMOffset + mul24(get_local_size(1), get_global_size(0))];
                                                                       ^~~~~
cl_kernel.h:3285:22: note: candidate function
int __OVERLOADABLE__ mul24(int, int);

or

/opt/nVidia/OpenCL/bin/linux/release/oclHistogram Starting...

 !!! Error # -5 at line 119 , in file src/main.cpp !!!

Exiting...

and similar. I’m cutting it off here to spare the reader hundreds more lines of misery.
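
For anyone decoding the numbers above: assuming the samples print the raw OpenCL status code, -11 and -5 correspond to CL_BUILD_PROGRAM_FAILURE and CL_OUT_OF_RESOURCES in the Khronos CL/cl.h header. A minimal lookup sketch follows; the function name clErrorName is made up for illustration and is not part of the SDK.

```cpp
#include <string>

// Names taken from the first entries of the Khronos CL/cl.h header.
// Assumption: the sample prints the raw cl_int status it received.
// clErrorName is a hypothetical helper, not an SDK function.
std::string clErrorName(int status) {
    switch (status) {
        case 0:   return "CL_SUCCESS";
        case -1:  return "CL_DEVICE_NOT_FOUND";
        case -2:  return "CL_DEVICE_NOT_AVAILABLE";
        case -3:  return "CL_COMPILER_NOT_AVAILABLE";
        case -4:  return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
        case -5:  return "CL_OUT_OF_RESOURCES";
        case -6:  return "CL_OUT_OF_HOST_MEMORY";
        case -7:  return "CL_PROFILING_INFO_NOT_AVAILABLE";
        case -8:  return "CL_MEM_COPY_OVERLAP";
        case -9:  return "CL_IMAGE_FORMAT_MISMATCH";
        case -10: return "CL_IMAGE_FORMAT_NOT_SUPPORTED";
        case -11: return "CL_BUILD_PROGRAM_FAILURE";
        case -12: return "CL_MAP_FAILURE";
        // Anything outside this subset is reported with its number.
        default:  return "unlisted status " + std::to_string(status);
    }
}
```

So the oclMedianFilter failure is a kernel compile failure (which matches the build log it prints), while the oclHistogram one would be a resource problem at run time, if the assumption about raw status codes holds.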

Not sure what I am supposed to learn from this. Of course I am frustrated after a week of poking around with nothing coming out of my NVIDIA card. But I’m sure someone here will again explain to me how all these experiences with NVIDIA software quality are in fact my fault and my fault only.

If NVIDIA’s “policy” is to merely pretend to support OpenCL while throwing 99% of its effort at CUDA, please say so (like real men).

I realize this is perhaps a rant, so I won’t try to address all the points raised.

Yes, NVIDIA puts more effort into CUDA than OpenCL. I don’t think this is news. I don’t think you have to go through this level of analysis to discover that or arrive at that conclusion.

The proximate reason for the difficulty you’re describing here is just as you’ve stated: the codes you’re touching are quite old and have not been updated. Toolchains have moved on, however, so the samples no longer build and run the way they used to.
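
The `mul24` build error quoted earlier in the thread looks like one concrete instance of this. get_local_size() and get_global_size() return size_t, and mul24 is declared for both int and unsigned int arguments, so a stricter compiler cannot pick an overload. The same ambiguity can be reproduced in plain C++. This is only a sketch: the two mul24 functions and the size helpers below are stand-ins, not the OpenCL built-ins, and I have not rebuilt the sample with this change.

```cpp
#include <cstddef>

// Stand-ins for the scalar mul24 overloads (plain multiplication here,
// not the 24-bit fast path the real built-in is meant to use).
int mul24(int a, int b) { return a * b; }
unsigned int mul24(unsigned int a, unsigned int b) { return a * b; }

// Hypothetical replacements for get_local_size(1) / get_global_size(0).
std::size_t localSizeY() { return 16; }
std::size_t globalSizeX() { return 512; }

int rowOffset() {
    // return mul24(localSizeY(), globalSizeX());
    // ^ does not compile: size_t converts equally well to int and to
    //   unsigned int, so the call is ambiguous.

    // Casting both arguments to one type selects a single overload.
    return mul24(static_cast<int>(localSizeY()),
                 static_cast<int>(globalSizeX()));
}
```

In the kernel source, the equivalent change would presumably be `mul24((int)get_local_size(1), (int)get_global_size(0))`.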

It is possible to reconstruct and compile the original samples in an orderly fashion, if one is so motivated. A description of how it might be done is here:

https://devtalk.nvidia.com/default/topic/768502/opencl-example-code-doesn-t-compile-cuda-6-0-ubuntu-12-04-5-/

Note that the RHEL 5.5 I used there is old, so it includes GCC 4.1.2, which helps if you’re trying to compile these old codes.

AFAIK NVIDIA no longer maintains a “current” OCL SDK. When OCL was brand-new (circa 2010), there was some value in that. Nowadays there are a great many OCL samples and resources available, so developers wanting to use or learn OCL generally should not need to depend on NVIDIA-provided samples.
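
If you would rather stay on a newer GCC, the -Wwrite-strings warning that the quoted build log repeats is mechanical to fix: a function that is handed a string literal should take `const char *` rather than `char *`. The log does not show the real signature of file_read_array, so the helper below is a hypothetical stand-in that only illustrates the parameter type.

```cpp
#include <string>

// Hypothetical stand-in for a helper such as file_read_array.
// Taking the file name as const char* lets callers pass a string
// literal without the "deprecated conversion" warning.
std::string describeInput(const char *fileName, int systemSize) {
    return std::string(fileName) + ":" + std::to_string(systemSize);
}
```

A call such as `describeInput("a256.txt", 256)` then compiles cleanly, whereas the same call against a `char *` parameter is what triggers the warning.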