I have a very simple kernel that I’m trying to compile (using my own modified version of clcc). This is the kernel:
__kernel void init(__global float * v, int n)
{
    int i = get_local_id(0)
        + get_local_size(0) * get_group_id(0)
        + get_local_size(0) * get_global_size(0) * get_local_size(1) * get_group_id(1)
        + get_local_size(0) * get_global_size(0) * get_local_id(1)
        ;
    if (i > n)
        return;
    if (i < 0)
        return;
    float w = 1.0 / n;
    v[i] = i * w;
}
The first statement in the kernel maps the work-item and work-group IDs to an array index. This statement seems to cause problems for the compiler.
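To make the mapping easier to inspect off the device, here is a minimal host-side sketch in plain C that evaluates the same expression for one work item. The `work_item` struct and both function names are hypothetical stand-ins for the values the OpenCL work-item functions would return; they are not part of any OpenCL API. The second function is a dense row-major index, included only for comparison and assuming a dense row-major layout is the goal, which is not stated above.

```c
#include <stddef.h>

/* Hypothetical stand-in for what the OpenCL work-item functions
   would return for a single work item in a 2D NDRange. */
typedef struct {
    size_t local_id[2];    /* get_local_id(d)    */
    size_t group_id[2];    /* get_group_id(d)    */
    size_t local_size[2];  /* get_local_size(d)  */
    size_t global_size[2]; /* get_global_size(d) */
} work_item;

/* The index expression exactly as written in the kernel. */
static int kernel_index(const work_item *w)
{
    return (int)(w->local_id[0]
        + w->local_size[0] * w->group_id[0]
        + w->local_size[0] * w->global_size[0] * w->local_size[1] * w->group_id[1]
        + w->local_size[0] * w->global_size[0] * w->local_id[1]);
}

/* Assumed comparison: dense row-major index gx + global_size(0) * gy,
   where gx and gy are the global IDs in dimensions 0 and 1. */
static int row_major_index(const work_item *w)
{
    size_t gx = w->local_id[0] + w->local_size[0] * w->group_id[0];
    size_t gy = w->local_id[1] + w->local_size[1] * w->group_id[1];
    return (int)(gx + w->global_size[0] * gy);
}
```

For a 4x4 global range with 2x2 work-groups, the work item with local ID (1,1) in group (1,1) gets 27 from the kernel expression and 15 from the row-major one. The two agree when get_local_size(0) is 1, because the kernel's row stride is get_local_size(0) * get_global_size(0) rather than get_global_size(0).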
I compiled this using calls to clCreateProgramWithSource then clBuildProgram. However, clBuildProgram returned -42 (= CL_INVALID_BINARY).
The following rewritten version, which first stores get_local_id(1) in a local variable j, is accepted.
__kernel void init(__global float * v, int n)
{
    int j = get_local_id(1);
    int i = get_local_id(0)
        + get_local_size(0) * get_group_id(0)
        + get_local_size(0) * get_global_size(0) * get_local_size(1) * get_group_id(1)
        // + get_local_size(0) * get_global_size(0) * get_local_id(1)
        + get_local_size(0) * get_global_size(0) * j
        ;
    if (i > n)
        return;
    if (i < 0)
        return;
    float w = 1.0 / n;
    v[i] = i * w;
}
This seems like a compiler bug.
In addition, the return value CL_INVALID_BINARY is a little weird. According to the OpenCL Specification, clBuildProgram returns “CL_INVALID_BINARY if program is created with clCreateProgramWithBinary and devices listed in device_list do not have a valid program binary loaded.” That is the only condition under which it can return CL_INVALID_BINARY, yet I never called clCreateProgramWithBinary. It seems the OpenCL driver is not compliant with the spec.
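For reading raw return values like -42, a small lookup helper can be handy. The sketch below is plain C with the numeric values hard-coded from the Khronos cl.h header so that it builds without the OpenCL SDK. The names `cl_error_name`, `build_errors` and `build_error_name` are hypothetical, and the table covers only the codes the spec lists for clBuildProgram as far as I know.

```c
#include <stddef.h>

/* Hypothetical helper type: pairs an OpenCL error code with its name.
   Values are copied from cl.h rather than taken from the header itself. */
typedef struct {
    int code;
    const char *name;
} cl_error_name;

static const cl_error_name build_errors[] = {
    {   0, "CL_SUCCESS" },
    {  -3, "CL_COMPILER_NOT_AVAILABLE" },
    {  -6, "CL_OUT_OF_HOST_MEMORY" },
    { -11, "CL_BUILD_PROGRAM_FAILURE" },
    { -30, "CL_INVALID_VALUE" },
    { -33, "CL_INVALID_DEVICE" },
    { -42, "CL_INVALID_BINARY" },
    { -43, "CL_INVALID_BUILD_OPTIONS" },
    { -44, "CL_INVALID_PROGRAM" },
    { -59, "CL_INVALID_OPERATION" },
};

/* Return the symbolic name for a clBuildProgram return code,
   or "unknown" if the code is not in the table above. */
static const char *build_error_name(int code)
{
    size_t count = sizeof build_errors / sizeof build_errors[0];
    for (size_t k = 0; k < count; ++k) {
        if (build_errors[k].code == code)
            return build_errors[k].name;
    }
    return "unknown";
}
```

A failure to compile source would normally be expected to come back as CL_BUILD_PROGRAM_FAILURE (-11), in which case the build log can be queried with clGetProgramBuildInfo and CL_PROGRAM_BUILD_LOG.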
Ken