clCreateProgramWithBinary() failing with CUDA 3.0 Toolkit


I have a library which uses clCreateProgramWithBinary() to instantiate a program from a binary object obtained from clGetProgramInfo( CL_PROGRAM_BINARIES ).

This worked happily with previous releases, but with 3.0 the clCreateProgramWithBinary() call returns error -42 (CL_INVALID_BINARY). The program binary retrieved with clGetProgramInfo() is PTX code, as in previous releases, but it seems that the runtime no longer accepts PTX binary objects.

Is this a known issue? What type of binary object is now acceptable, and how can I obtain it?



Yeah, I’m having the same problem. It would be nice to be able to (a) reuse compiled code on multiple identical devices, and (b) work around the lack of thread safety in the compiler by offloading compilation to another process. Compilation time is currently the bottleneck in my application, so this is a big issue.

Note that the failure actually occurs on clCreateKernel, which then returns CL_INVALID_PROGRAM_EXECUTABLE.

I figured it out. You have to call clBuildProgram after you call clCreateProgramWithBinary. Unintuitive, but it works!