Hi!
I use OpenCL successfully in some applications on my system (Ubuntu 10.04 32-bit, GeForce 8800 GT with Nvidia Driver 260.19.12, CUDA-Toolkit 3.2.9, OpenCL 1.0).
But when I try to pass an 8-component vector as an argument to a kernel (which the specification allows), the program fails to build. Here is a very simple code snippet; just try to build it, you do not even have to set any kernel arguments at this stage.
__kernel void testkernel(__global int* output, int8 params) {
    output[0] = params.s5;
}
I get the following error output:
Program compilation notice. Returned status: 'CL_BUILD_ERROR',
ptxas application ptx input, line 20; error : Argument 1 of instuction 'add': must be register
ptxas fatal : Ptx assembly aborted due to errors
error : Ptx compilation failed: gpu='sm_11', device code='cuModuleLoadDataEx_17'
: Considering profile 'compute_11' for gpu='sm_11' in 'cuModuleLoadDataEx_17'
: Retrieving binary for 'cuModuleLoadDataEx_17', for gpu='sm_11', usage mode=' '
: Considering profile 'compute_11' for gpu='sm_11' in 'cuModuleLoadDataEx_17'
: Control flags for 'cuModuleLoadDataEx_17' disable search path
: Ptx binary found for 'cuModuleLoadDataEx_17', architecture='compute_11'
: Ptx compilation for 'cuModuleLoadDataEx_17', for gpu='sm_11', ocg options=' '
For some strange reason, compilation succeeds when using params.s0 … params.s4 instead, but fails for params.s5 … params.s7.
The same happens with the other 8-component vector types such as float8, char8, etc.
It seems that there is some very basic problem with accessing the last three components of any 8-component vector passed as a kernel argument. Can anybody confirm this?
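One way to check whether the problem is specific to by-value vector arguments would be to read the same component through a pointer instead. The following is only a sketch and has not been tried on the driver listed above. The kernel name testkernel_ptr and the one-element params buffer are hypothetical; the host would have to supply that buffer (32 bytes, created with clCreateBuffer) instead of passing the int8 by value.

```c
// Hypothetical variant of testkernel: the int8 is read from a one-element
// buffer instead of being passed by value as a kernel argument.
__kernel void testkernel_ptr(__global int* output,
                             __global const int8* params) {
    // Load the whole vector from global memory first.
    int8 p = params[0];
    // Then access one of the components that fails in the by-value version.
    output[0] = p.s5;
}
```

If this builds while the by-value version does not, that would point at the handling of vector kernel arguments rather than at component access in general.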
[edit: reworked after the first answer (thanks to @jcpalmer) to make things more precise]