I recently upgraded the NVIDIA driver from 515 to 535 (I also tried 525) and noticed a bug while running clinfo and one of our applications that uses the new CUDA 12.
Both clinfo and the application run inside a Docker container. The issues do not occur when running natively.
This is the relevant clinfo output:
  Platform Name                                   NVIDIA CUDA
  Number of devices                               1
  Device Name                                     NVIDIA RTX A4000
  Device Vendor                                   NVIDIA Corporation
  Device Vendor ID                                0x10de
  Device Version                                  OpenCL 3.0 CUDA
  Device UUID                                     6fab8f10-ccd0-3111-3f1a-e8ea712c3184
  Driver UUID                                     6fab8f10-ccd0-3111-3f1a-e8ea712c3184
  Valid Device LUID                               No
  Device LUID                                     0000-4000636c5f6b
  Device Node Mask                                0
  Device Numeric Version                          0xc00000 (3.0.0)
  Driver Version                                  535.86.05
  Device OpenCL C Version                         OpenCL C 1.2
  Device OpenCL C all versions                    OpenCL C 0x400000 (1.0.0)
                                                  OpenCL C 0x401000 (1.1.0)
                                                  OpenCL C 0x402000 (1.2.0)
                                                  OpenCL C 0xc00000 (3.0.0)
  Device OpenCL C features                        __opencl_c_fp64 0xc00000 (3.0.0)
                                                  __opencl_c_images 0xc00000 (3.0.0)
                                                  __opencl_c_int64 0xc00000 (3.0.0)
                                                  __opencl_c_3d_image_writes 0xc00000 (3.0.0)
  Latest comfornace test passed                   v2022-10-05-00
  Device Type                                     GPU
  Device Topology (NV)                            PCI-E, 0000:00:00.4
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               48
  Max clock frequency                             1560MHz
  Compute Capability (NV)                         8.6
  Device Partition                                (core)
    Max number of sub-devices                     1
    Supported partition types                     None
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x64
  Max work group size                             1024
  Preferred work group size multiple (device)     32
=== CL_PROGRAM_BUILD_LOG ===
  Preferred work group size multiple (kernel)     <getWGsizes:1504: create kernel : error -45>
  Warp size (NV)                                  32
  ...
Error -45 does not appear with NVIDIA driver 515 or older; changing only the driver version, with everything else held constant, is enough to trigger it. I believe this error is related to the failure our application hits in build program (Failed to compile OpenCL program - Error: -11). When running the application, the program build log is empty.
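For reference, the two numeric codes above can be decoded with a small lookup. This is only a sketch: the table is a hand-picked subset of the status codes defined in CL/cl.h, and the helper name cl_error_name is made up for illustration. Code -45 is what kernel creation reports when the program has no successfully built executable, which would be consistent with the build step having failed first with -11.

```python
# Hand-picked subset of OpenCL status codes (values as defined in CL/cl.h).
CL_ERROR_NAMES = {
    0: "CL_SUCCESS",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -11: "CL_BUILD_PROGRAM_FAILURE",
    -44: "CL_INVALID_PROGRAM",
    -45: "CL_INVALID_PROGRAM_EXECUTABLE",
    -46: "CL_INVALID_KERNEL_NAME",
}


def cl_error_name(code):
    """Return the symbolic name for an OpenCL status code (hypothetical helper)."""
    return CL_ERROR_NAMES.get(code, f"unknown OpenCL status {code}")


if __name__ == "__main__":
    # The two codes seen in the clinfo output and in the application.
    for code in (-45, -11):
        print(code, cl_error_name(code))
```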
I have tried the different OpenCL packages available (cuda-opencl from the NVIDIA repo and libOpenCL1 from the openSUSE main repo) and still see the same output. libnvidia-opencl.so seems to differ between the two driver installations, so I am not sure whether the issue lies there. Any help is greatly appreciated.
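To compare what the container and the host actually see, something like the following could be run in both places. It is a minimal sketch, not a definitive diagnostic: it assumes the standard Linux ICD loader location /etc/OpenCL/vendors, the library directories listed are guesses that may not match every distribution, and the function names are hypothetical.

```python
from pathlib import Path


def list_opencl_icds(vendors_dir="/etc/OpenCL/vendors"):
    """Map each .icd file name to the library name or path written inside it."""
    result = {}
    root = Path(vendors_dir)
    if not root.is_dir():
        return result
    for icd in sorted(root.glob("*.icd")):
        result[icd.name] = icd.read_text().strip()
    return result


def find_nvidia_opencl_libs(
    # Assumed search locations; adjust for the distribution in use.
    lib_dirs=("/usr/lib64", "/usr/lib/x86_64-linux-gnu", "/usr/lib"),
):
    """Return paths of libnvidia-opencl.so* files found in the given directories."""
    found = []
    for directory in lib_dirs:
        path = Path(directory)
        if path.is_dir():
            found.extend(sorted(str(f) for f in path.glob("libnvidia-opencl.so*")))
    return found


if __name__ == "__main__":
    print("ICD files:", list_opencl_icds())
    print("NVIDIA OpenCL libraries:", find_nvidia_opencl_libs())
```

If the version suffix of the libnvidia-opencl.so file visible inside the container differs from the host driver version (535.86.05 here), that mismatch would be the first thing I would look at.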