Hi all,
Quite simply: the following function doesn’t work, apparently because the formal parameter list is somehow corrupt. Looking at the contents of dst after the call, the values have not changed at all (e.g. memset-ing dst to 32 before the call leaves dst still containing 32 afterwards).
The odd thing is that if I put the ‘dst’ parameter at the start of the parameter list, it works fine, which suggests the corruption starts midway through the list. As you can see, though, I’m setting the parameters correctly (as far as I’m aware).
I’ve double and triple checked that all the values I’m passing into ‘cuParamSetX(a, b, value)’ are correct, so I’m really not sure what’s going on here. The only thing I can think of is that I’ve somehow misunderstood the size of uint4/float4/pointers, and must have magically got it right with all my other CUDA functions (unlikely…).
Sorry to dump a whole bunch of code… but without showing how I pass the parameters, this thread is somewhat useless.
This is using CUDA 2.0 on Windows XP and Vista (same results on both).
CUDA kernel:
[codebox]extern "C" __global__ void function_name(uint4 img_data, uint4 src_roi, uint2 dst_pos, float4 value, unsigned char *src, unsigned char *dst)
{
    const unsigned int global_tid = __umul24(blockIdx.x, blockDim.x) + threadIdx.x;
    dst[global_tid] = 128;
}[/codebox]
Driver API Code (for setting parameters/param size):
[codebox]size_t offset(0);
// img_data
cuParamSeti(kernel, offset, …); offset += sizeof(int);
cuParamSeti(kernel, offset, …); offset += sizeof(int);
cuParamSeti(kernel, offset, …); offset += sizeof(int);
cuParamSeti(kernel, offset, …); offset += sizeof(int);
// src_roi
cuParamSeti(kernel, offset, …); offset += sizeof(int);
cuParamSeti(kernel, offset, …); offset += sizeof(int);
cuParamSeti(kernel, offset, …); offset += sizeof(int);
cuParamSeti(kernel, offset, …); offset += sizeof(int);
// dst_pos
cuParamSeti(kernel, offset, …); offset += sizeof(int);
cuParamSeti(kernel, offset, …); offset += sizeof(int);
// value
cuParamSetf(kernel, offset, …); offset += sizeof(float);
cuParamSetf(kernel, offset, …); offset += sizeof(float);
cuParamSetf(kernel, offset, …); offset += sizeof(float);
cuParamSetf(kernel, offset, …); offset += sizeof(float);
// src
cuParamSeti(kernel, offset, …); offset += __alignof(void*);
// dst
cuParamSeti(kernel, offset, …); offset += __alignof(void*);
cuParamSetSize(kernel, offset);[/codebox]
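For comparison, here is a minimal sketch of the offset arithmetic under one assumption: that each parameter must start at a multiple of its own alignment. It assumes 16 bytes for uint4 and float4, 8 bytes for uint2, and the pointer size for pointers. The names `align_up`, `ParamLayout` and `layout` are hypothetical helpers for illustration, not part of the CUDA API, and the sketch only computes offsets without calling the driver.

```cpp
#include <cstddef>

// Round `offset` up to the next multiple of `alignment` (must be a power of two).
constexpr std::size_t align_up(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Hypothetical record of where each kernel parameter would start, in bytes.
struct ParamLayout {
    std::size_t img_data, src_roi, dst_pos, value, src, dst, total;
};

// Assumed sizes and alignments (not taken from the original post):
//   uint4 / float4 : 16 bytes, 16-byte aligned
//   uint2          :  8 bytes,  8-byte aligned
//   pointers       : ptr_size bytes, aligned to ptr_size
constexpr ParamLayout layout(std::size_t ptr_size) {
    ParamLayout p{};
    std::size_t off = 0;
    off = align_up(off, 16);       p.img_data = off; off += 16;        // uint4
    off = align_up(off, 16);       p.src_roi  = off; off += 16;        // uint4
    off = align_up(off, 8);        p.dst_pos  = off; off += 8;         // uint2
    off = align_up(off, 16);       p.value    = off; off += 16;        // float4
    off = align_up(off, ptr_size); p.src      = off; off += ptr_size;  // unsigned char*
    off = align_up(off, ptr_size); p.dst      = off; off += ptr_size;  // unsigned char*
    p.total = off;  // the value that would be passed to cuParamSetSize
    return p;
}
```

Under these assumptions, with 4-byte pointers `layout(4)` puts value at 48, dst at 68 and the total at 72. The unpadded running total in the driver code above puts value at 40, dst at 60 and the total at 64, so everything after dst_pos would be shifted by 8 bytes.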
I should probably note that in the actual code I’m using, all of my CUDA calls are wrapped in a macro that checks for errors, and nothing returns an error at any stage of my code.
Thanks in advance for any help,
Cheers.