CL_INVALID_KERNEL_DEFINITION bug, when calling clCreateKernel() with multiple GPUs


I believe there is an nVidia Driver bug; could an nVidia developer please investigate and fix it?

Is there an nVidia driver bug in the CUDA/OpenCL functions clCreateKernelsInProgram() or clCreateKernel() that can incorrectly return the error code CL_INVALID_KERNEL_DEFINITION when more than one type of GPU is in the system? The failure may also depend on the number of constants used in the program, and one of the links below suggests a possible buffer overrun.

Basically, I’m trying to identify the cause of a situation where, on my system with 2 different video cards (a GTX 660 Ti and a GTX 460), I cannot use BOINC to run 2 POEM@Home OpenCL tasks at the same time.

My research with the POEM@Home developers indicates that there is an nVidia driver bug; please look at the following links:

From 2010, shows developers describing the problem that I’m seeing.
Blog post where another developer has the same problem.
GitHub code repository that has the code to easily see the problem.

Also, I’m not a CUDA developer, merely a distributed-computing volunteer, but I ran a slightly-modified Windows version of the program (created by the POEM developers) to confirm that I was getting the same failures as the other developers. I’ve attached my results.
That version can be found here:

Could you kindly look into this issue, so that we can better use our different cards simultaneously to do OpenCL computing? This is currently preventing users from properly utilizing their cards. PS: I’ve also tested the (latest) 313.95 beta drivers, which still exhibit the problem.

I’m hoping you guys can fix this!!
And if there’s anything I can do to test a fix, I am available anytime.

Thanks in advance,
Jacob Klein

Does anybody know more information about this issue?
On nVidia Support, I used “Ask a Question” (Reference # 130128-000166) … but they were not helpful, and haven’t replied in 6 weeks!
I’ve tried to give as much information as I can.

How can we get this fixed?!?


Am I asking in the wrong place? Any assistance would be greatly appreciated.

Are there any nVidia developers that can reproduce this issue?

I don’t think so …
I have a system with 3 different nVidia GPUs and an AMD CPU. That gives 2 different platforms, 2 different contexts, and 4 command queues, and the kernel has to be loaded for each device from a program built in the right context.
The only problem I have is that the GPUs do not seem to be capable of performing many calculations: the CPU is completely stuck, and I get no output on any of the devices when I launch the kernel on all 4 of them.

I guess (but I’m not sure) that you forgot to call clReleaseKernel() somewhere. Make sure everything is released after kernel execution.

Did you run the small test that was provided? What output did you get from it?

I have reason to believe that the new v320.00 beta drivers may have solved this problem. I’ll do some additional testing to confirm.

The v320.00 drivers did fix the CL_INVALID_KERNEL_DEFINITION error, but… I still cannot run POEM@Home OpenCL tasks at the same time on both of my heterogeneous GPUs. I am continuing to investigate.