Building Device Binaries

Is there a way to directly build OpenCL binaries for a device? My programs have become very complex, so the OpenCL compiler always needs several seconds to compile them. This is very annoying, especially in debug mode. Is there an external compiler or something like that?


You can write your own “compiler” by compiling OpenCL programs in the usual way with clBuildProgram. Then you can read out the binaries using clGetProgramInfo with the parameter CL_PROGRAM_BINARIES. You can store the returned binary in a file and reuse it on the next run by loading it with clCreateProgramWithBinary instead of rebuilding from source.
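The file-caching half of this advice can be sketched without any OpenCL calls. The helper names saveBinary and loadBinary and the byte-vector representation are assumptions made for illustration; they are not part of the OpenCL API. The bytes themselves would come from clGetProgramInfo with CL_PROGRAM_BINARIES.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: writes a program binary to disk.
// Returns false if the file could not be opened or written.
bool saveBinary(const std::string& path, const std::vector<unsigned char>& binary)
{
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    if (!out)
        return false;
    if (!binary.empty())
        out.write(reinterpret_cast<const char*>(&binary[0]),
                  static_cast<std::streamsize>(binary.size()));
    out.flush();
    return out.good();
}

// Hypothetical helper: reads a cached binary back into `binary`.
// Returns false when no usable cache file exists, in which case the
// caller would fall back to building from source.
bool loadBinary(const std::string& path, std::vector<unsigned char>& binary)
{
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in)
        return false;
    binary.assign(std::istreambuf_iterator<char>(in),
                  std::istreambuf_iterator<char>());
    return !binary.empty();
}
```

On the next run, the loaded bytes and their length would be handed to clCreateProgramWithBinary, followed by clBuildProgram as usual. A cached binary is tied to the device and driver it was built with, so it is probably wise to fall back to the source build whenever loading or building the cached binary fails.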

Thx! :)

I’ll try it.

So I’ve tried the following:

m_deviceContext = clCreateContextFromType(0, CL_DEVICE_TYPE_GPU, NULL, NULL, &ciErr1);

clProgram = clCreateProgramWithSource(m_deviceContext, 1, (const char **)&cProgramFile, &szKernelLength, &ciErr1);

ciErr1 |= clBuildProgram(clProgram, 0, NULL, NULL, NULL, NULL);

size_t binarySize;

void *programBinary;

ciErr1 = clGetProgramInfo(clProgram,CL_PROGRAM_BINARY_SIZES,NULL,NULL,&binarySize);

programBinary = (void *)malloc(sizeof(char*) *binarySize);

ciErr1 = clGetProgramInfo(clProgram,CL_PROGRAM_BINARIES,binarySize,programBinary,NULL);


Every time I run this, I get a memory access violation at the last clGetProgramInfo call.

Where is my mistake?

I couldn’t find any example for clGetProgramInfo, so I tried it on my own.

Sorry, I don’t have much time to reply. Here’s how I’m doing it:

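[codebox]
#include <CL/cl.h>
#include <iostream>
#include <vector>

// Prints the binary (PTX text on NVIDIA) for every device the program was built for.
// Assumes a built cl_program named `program` and an error-checking macro
// CL_CHECK_ERROR are defined elsewhere.
void printBinaries()
{
    // Number of devices associated with the program.
    cl_uint program_num_devices;
    CL_CHECK_ERROR(clGetProgramInfo(program, CL_PROGRAM_NUM_DEVICES,
                                    sizeof(cl_uint), &program_num_devices, NULL));

    if (program_num_devices == 0)
    {
        std::cerr << "no valid binary was found" << std::endl;
        return;
    }

    // Size in bytes of each device's binary.
    std::vector<size_t> binaries_sizes(program_num_devices);
    CL_CHECK_ERROR(clGetProgramInfo(program, CL_PROGRAM_BINARY_SIZES,
                                    program_num_devices * sizeof(size_t),
                                    &binaries_sizes[0], NULL));

    // CL_PROGRAM_BINARIES expects an array of pointers: one preallocated buffer per device.
    char **binaries = new char*[program_num_devices];
    for (size_t i = 0; i < program_num_devices; i++)
        binaries[i] = new char[binaries_sizes[i] + 1];

    CL_CHECK_ERROR(clGetProgramInfo(program, CL_PROGRAM_BINARIES,
                                    program_num_devices * sizeof(char *), binaries, NULL));

    for (size_t i = 0; i < program_num_devices; i++)
    {
        // Terminate the buffer so it can be printed as text.
        binaries[i][binaries_sizes[i]] = '\0';
        std::cout << "Program " << i << ":" << std::endl;
        std::cout << binaries[i];
    }

    for (size_t i = 0; i < program_num_devices; i++)
        delete [] binaries[i];
    delete [] binaries;
}
[/codebox]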
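For anyone hitting the same access violation as in the earlier attempt: it is consistent with how CL_PROGRAM_BINARIES is specified. Its param_value must be an array of pointers, one per device, each already pointing at a buffer of the size reported by CL_PROGRAM_BINARY_SIZES. If a single flat buffer is passed instead, the runtime treats its contents as pointers. The following is only a sketch of that memory layout. Because a real query needs an OpenCL device, the runtime is replaced by a hypothetical stand-in, fakeGetBinaries, which is not an OpenCL function; allocateBuffers and makePointerTable are likewise names invented for this example.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the OpenCL runtime (NOT part of OpenCL).
// It fills each caller-provided buffer with a marker byte, mimicking how
// the CL_PROGRAM_BINARIES query writes one binary per device.
// NULL or zero-sized entries are skipped.
void fakeGetBinaries(unsigned char** buffers, const size_t* sizes, size_t count)
{
    for (size_t i = 0; i < count; ++i)
    {
        if (buffers[i] == nullptr || sizes[i] == 0)
            continue;
        std::memset(buffers[i], static_cast<int>('A' + i), sizes[i]);
    }
}

// Allocates one byte buffer per device, sized from the BINARY_SIZES query.
std::vector<std::vector<unsigned char> > allocateBuffers(const std::vector<size_t>& sizes)
{
    std::vector<std::vector<unsigned char> > storage;
    storage.reserve(sizes.size());
    for (size_t i = 0; i < sizes.size(); ++i)
        storage.push_back(std::vector<unsigned char>(sizes[i]));
    return storage;
}

// Builds the pointer table that is passed as param_value.
// `storage` must outlive the table and must not be resized afterwards.
std::vector<unsigned char*> makePointerTable(std::vector<std::vector<unsigned char> >& storage)
{
    std::vector<unsigned char*> table;
    table.reserve(storage.size());
    for (size_t i = 0; i < storage.size(); ++i)
        table.push_back(storage[i].empty() ? nullptr : &storage[i][0]);
    return table;
}
```

With a real program object, the stand-in call would be replaced by clGetProgramInfo(program, CL_PROGRAM_BINARIES, table.size() * sizeof(unsigned char*), table.data(), NULL), after which each entry of the storage holds one device's binary.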



Thx a lot. Now it works! Thank you very much! :)

Is there an emulation debug mode available for NVIDIA’s OpenCL (like in the CUDA SDK)?

No, there is not (in this release, anyway).

(I guess this thread could be an appropriate place to ask this question:)

How can I check the number of registers used by an OpenCL kernel? With CUDA, this can be read from the .cubin file. I tried saving the compiled OpenCL kernels into a .ptx file (using clGetProgramInfo() to get the compiled PTX code) and then running “nvcc -cubin …” on this file, but, as already mentioned on this forum, the PTX generated by the OpenCL SDK is version 1.5, while nvcc from CUDA 2.3 supports only PTX 1.4…