A very basic CUDA question

Hi all,

If I do this in my kernel, I get nothing.

//pos is the array I am passing in from the host application to the kernel

//F is a constant value

//float4 x_i = pos[index];

//float4 F   = make_float4(0,-0.0098,0,0);

float4 newVal = x_i+F; 

pos[index] = newVal;

If I change it to this,

float4 newVal = make_float4(x_i.x+F.x, x_i.y+F.y, x_i.z+F.z, 1.0f); 

pos[index] = newVal;

I get the correct output. Could anyone tell me why the first version does not work while the second one does?



Can you post a complete kernel that shows the same symptom? If a single line of source code does not seem to behave the way it is supposed to, it is most often because of its context.

Sure, here it is.

__global__ void kernel(float4* pos, unsigned int width, unsigned int height, float4 gravity)

{

    unsigned int x = blockIdx.x*blockDim.x + threadIdx.x;

    unsigned int y = blockIdx.y*blockDim.y + threadIdx.y;

    unsigned int index = y*width+x;

    float4 x_i = pos[index];

    float4 F   = gravity;

    float4 newVal = x_i+F;                                                 //this doesn't work

    //float4 newVal = make_float4(x_i.x+F.x, x_i.y+F.y, x_i.z+F.z, 1.0f);  //this works

    pos[index] = newVal;

}

OK, I think I know why I am getting the wrong output. The last coordinate (w) of the position I pass in is not 1 (it contains the mass), so when the position is added to the force F, the w of the sum is larger than 1. The first version is therefore working; the projection is just occurring at some other plane and not at the w=1 plane. The resolution is to simply set either position.w to 1 or the last coordinate of the sum to 1. Thanks for the input.
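
For anyone who finds this thread later, here is a minimal sketch of that resolution. It assumes the `+` in the first version is a helper_math.h-style overload from the CUDA samples, which adds all four components including w. The names `applyForce` and `stepOnce` and the bounds guard are my own additions for illustration and are not part of the original kernel.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper (not from the thread): adds the force to x, y and z only
// and sets w to 1, so the mass stored in the incoming w never reaches the projection.
__host__ __device__ inline float4 applyForce(float4 p, float4 f)
{
    return make_float4(p.x + f.x, p.y + f.y, p.z + f.z, 1.0f);
}

__global__ void kernel(float4* pos, unsigned int width, unsigned int height, float4 gravity)
{
    unsigned int x = blockIdx.x * blockDim.x + threadIdx.x;
    unsigned int y = blockIdx.y * blockDim.y + threadIdx.y;

    // Assumed guard: skip threads that fall outside the width x height grid.
    if (x >= width || y >= height) return;

    unsigned int index = y * width + x;
    pos[index] = applyForce(pos[index], gravity);
}

// Hypothetical host driver: runs one step on hostPos and copies the result back.
// Returns cudaSuccess, or the first error that occurred.
cudaError_t stepOnce(float4* hostPos, unsigned int width, unsigned int height, float4 gravity)
{
    float4* devPos = nullptr;
    size_t bytes = sizeof(float4) * width * height;

    cudaError_t err = cudaMalloc(&devPos, bytes);
    if (err != cudaSuccess) return err;

    err = cudaMemcpy(devPos, hostPos, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        dim3 block(8, 8);
        dim3 grid((width + block.x - 1) / block.x, (height + block.y - 1) / block.y);
        kernel<<<grid, block>>>(devPos, width, height, gravity);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess)
        err = cudaMemcpy(hostPos, devPos, bytes, cudaMemcpyDeviceToHost);

    cudaFree(devPos);
    return err;
}
```

Note that this overwrites the mass in pos with 1. If the mass is still needed in later steps, it would probably be better to keep it in a separate array, or to leave w alone in the kernel and set it to 1 only where the position is handed to the renderer.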