My OpenCL program fails to build with return value CL_INVALID_BINARY and the following error output:
ptxas application ptx input, line 36; fatal : Parsing error near '-': syntax error
ptxas fatal : Ptx assembly aborted due to errors
error : Ptx compilation failed: gpu='sm_20', device code='cuModuleLoadDataEx_6'
: Considering profile 'compute_20' for gpu='sm_20' in 'cuModuleLoadDataEx_6'
: Retrieving binary for 'cuModuleLoadDataEx_6', for gpu='sm_20', usage mode=' '
: Considering profile 'compute_20' for gpu='sm_20' in 'cuModuleLoadDataEx_6'
: Control flags for 'cuModuleLoadDataEx_6' disable search path
: Ptx binary found for 'cuModuleLoadDataEx_6', architecture='compute_20'
: Ptx compilation for 'cuModuleLoadDataEx_6', for gpu='sm_20', ocg options=' '
The program source looks like this:
typedef __read_only image2d_t i2;
__kernel void shade_kernel(i2 texture, __global float4 * out)
{
int2 coords=(int2)(get_global_id(0),get_global_id(1));
const sampler_t texsampler = CLK_NORMALIZED_COORDS_FALSE|CLK_ADDRESS_REPEAT|CLK_FILTER_LINEAR;
*out = read_imagef(texture,texsampler,coords);
}
If I change the kernel to take the sampler as an argument:
typedef __read_only image2d_t i2;
__kernel void shade_kernel(i2 texture, __global float4 * out, sampler_t texsampler)
{
int2 coords=(int2)(get_global_id(0),get_global_id(1));
*out = read_imagef(texture,texsampler,coords);
}
the error message becomes:
ptxas application ptx input, line 36; error : Unknown symbol 'shade_kernel_param_32666'
ptxas fatal : Ptx assembly aborted due to errors
error : Ptx compilation failed: gpu='sm_20', device code='cuModuleLoadDataEx_6'
: Considering profile 'compute_20' for gpu='sm_20' in 'cuModuleLoadDataEx_6'
: Retrieving binary for 'cuModuleLoadDataEx_6', for gpu='sm_20', usage mode=' '
: Considering profile 'compute_20' for gpu='sm_20' in 'cuModuleLoadDataEx_6'
: Control flags for 'cuModuleLoadDataEx_6' disable search path
: Ptx binary found for 'cuModuleLoadDataEx_6', architecture='compute_20'
: Ptx compilation for 'cuModuleLoadDataEx_6', for gpu='sm_20', ocg options=' '
The number in shade_kernel_param_ is different on every run.
In both cases, the problem can be worked around by replacing "i2" with "__read_only image2d_t", but that is not possible when I have "__global i2 * textures" as a kernel parameter (which is where I encountered this bug).
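For reference, here is a sketch of the first kernel with that workaround applied. The typedef is gone and the image type is spelled out in the parameter list. I have also written the coordinate construction with an explicit (int2) cast, since without the cast the parenthesised pair is a comma expression and only get_global_id(1) is used. That cast is my own cleanup and is not part of the workaround.

```opencl
// Workaround sketch: use the explicit image type instead of the "i2" typedef.
__kernel void shade_kernel(__read_only image2d_t texture, __global float4 * out)
{
    // Explicit vector literal; a bare (x, y) would be a comma expression.
    int2 coords = (int2)(get_global_id(0), get_global_id(1));

    // Same sampler settings as in the failing version.
    const sampler_t texsampler = CLK_NORMALIZED_COORDS_FALSE |
                                 CLK_ADDRESS_REPEAT |
                                 CLK_FILTER_LINEAR;

    *out = read_imagef(texture, texsampler, coords);
}
```

The second variant (sampler passed as a kernel argument) is changed the same way: only the type of the first parameter differs.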
Driver version is 258.19 devrelease on Linux; the device is a GTX480.