I have a function to calculate the dot product of 2 vectors. It seems to work: I can cudaMemcpy the data to the device, operate on it, copy it back and verify the result. So I tried extending it to a function to calculate the magnitude of a vector (square root of a vector dotted with itself), which returns a scalar (float).
No matter what I do, I can’t seem to pass a scalar (float) as an output parameter to a __global__ function. I always get this error during link:
error C2664: ‘__device_stub__Z10CUDAmagwrkPfRf’ : cannot convert parameter 2 from ‘float’ to ‘float *’
I have the scalar (float) cudaMalloc’d, but I can’t pass by reference. Attempting to pass as a pointer fails at compile time.
I’m not sure why this results in a linker error, but internally a reference is really a pointer, so a reference to a host variable hands the kernel a host address. That wouldn’t work even if you could compile it, because the device cannot write to a host pointer. If you want to read back values from the device, you need to allocate them with cudaMalloc, pass the device pointer (float *) to the kernel, and copy the result back with cudaMemcpy. Alternatively, you can define a global device variable and copy back with cudaMemcpyFromSymbol, but global variables are evil so I wouldn’t recommend it.
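Here is a minimal sketch of the cudaMalloc / cudaMemcpy approach, not the original poster's code. The names sumOfSquares and vectorMagnitude are made up for illustration, and it assumes a device that supports atomicAdd on float (compute capability 2.0 or later) and n > 0. The point is that the kernel's output parameter is a float * to device memory rather than a float &.

```cuda
#include <cmath>
#include <cuda_runtime.h>

// Each thread squares one element and adds it to a running total that lives
// in device memory. `result` is a device pointer, not a reference.
// Assumption: float atomicAdd is available (compute capability 2.0+).
__global__ void sumOfSquares(const float* v, int n, float* result)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        atomicAdd(result, v[i] * v[i]);
    }
}

// Hypothetical host wrapper: returns the magnitude of hostV,
// or a negative value if any CUDA call fails.
float vectorMagnitude(const float* hostV, int n)
{
    float* devV = NULL;
    float* devResult = NULL;
    float sum = 0.0f;

    // Allocate the input vector and the scalar result on the device,
    // upload the input, and zero the accumulator.
    bool ok = cudaMalloc(&devV, n * sizeof(float)) == cudaSuccess
           && cudaMalloc(&devResult, sizeof(float)) == cudaSuccess
           && cudaMemcpy(devV, hostV, n * sizeof(float),
                         cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemset(devResult, 0, sizeof(float)) == cudaSuccess;

    if (ok) {
        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        sumOfSquares<<<blocks, threads>>>(devV, n, devResult);

        // This cudaMemcpy waits for the kernel to finish, then copies
        // the scalar back to the host.
        ok = cudaMemcpy(&sum, devResult, sizeof(float),
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    cudaFree(devV);       // cudaFree(NULL) is a harmless no-op
    cudaFree(devResult);
    return ok ? sqrtf(sum) : -1.0f;
}
```

Calling vectorMagnitude on the two-element vector {3, 4} should give 5. The square root is taken on the host here just to keep the kernel short; a shared-memory reduction would be the usual choice on hardware without float atomics.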