__global__ function parameters mangled?

I’m trying to pass two floating-point parameters to a CUDA function (d_roverPos and d_m01).

I did a cudaMalloc on them, and whenever I get new data, I copy it to the device using cudaMemcpy and pass the device pointers into the function. I’ve also tried passing host pointers directly to the function, which yielded the same results. Here’s the code:

extern "C" void cudaCalcFrame(float* roverPos, float* m01, float lambda, float* readBuff, float* writeBuff, int width) {
#ifdef __DEVICE_EMULATION__
  printf("roverPosition: (%.3f, %.3f, %.3f)  Input: (%.3f, %.3f)\n", roverPos[0], roverPos[1], roverPos[2], m01[0], m01[1]);
#endif

  // Copy inputs to device
  cudaMemcpy((void**)d_roverPos, roverPos, 3*sizeof(float), cudaMemcpyHostToDevice);
  cudaMemcpy((void**)d_m01, m01, 2*sizeof(float), cudaMemcpyHostToDevice);

  d_cudaCalcFrame <<< *grid, *threads >>> (d_roverPos, d_m01, lambda, readBuff, writeBuff, width);
}

The above print statement yields:

But in the actual CUDA function it prints out as:

This is clearly incorrect. Does anyone know why my data is getting mangled?

Where do you store the pointer you get back from cudaMalloc? In &d_roverPos? Because that is what you pass to cudaMemcpy currently. I think you store it in d_roverPos, so you need to pass the actual variable content (= the pointer value) to cudaMemcpy. That is

  float* d_roverPos;
  cudaMalloc(&d_roverPos,...);
  cudaMemcpy((void*)d_roverPos, ...);

should do the trick. The idea is that the pointer variable passed to cudaMalloc is a host variable, while its content is a device pointer value (= an address).
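To make the host-variable versus device-address distinction concrete, here is a minimal sketch of the whole round trip. The kernel scaleByTwo and the helper doubleOnDevice are hypothetical names made up for this illustration; they are not from the code in the question. The sketch assumes n > 0 and a working CUDA device.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical kernel: doubles each element in place. It only exists to
// show that the kernel receives a device address.
__global__ void scaleByTwo(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Hypothetical helper: copies n floats to the device, runs the kernel,
// and copies the result back. Returns the first CUDA error it sees.
cudaError_t doubleOnDevice(const float* hostIn, float* hostOut, int n) {
    float* d_data = NULL;  // host variable that will hold a device address
    size_t bytes = n * sizeof(float);

    // cudaMalloc takes the ADDRESS of the host variable so it can fill it in.
    cudaError_t err = cudaMalloc((void**)&d_data, bytes);
    if (err != cudaSuccess) return err;

    // cudaMemcpy takes the VALUE of the variable, i.e. the device address.
    err = cudaMemcpy(d_data, hostIn, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        int threads = 128;
        int blocks = (n + threads - 1) / threads;
        scaleByTwo<<<blocks, threads>>>(d_data, n);  // pass the device address
        err = cudaGetLastError();                    // catches launch failures
    }
    if (err == cudaSuccess) {
        // Blocking copy back to the host; it also waits for the kernel.
        err = cudaMemcpy(hostOut, d_data, bytes, cudaMemcpyDeviceToHost);
    }
    cudaFree(d_data);
    return err;
}
```

Calling doubleOnDevice(in, out, 3) with in = {1, 2, 3} should leave {2, 4, 6} in out. Checking the returned cudaError_t after each call is also a cheap way to spot a bad pointer early, rather than finding garbage values inside the kernel.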

Peter