I am trying to learn CUDA. I have the following kernel:
__global__ void DivideByPivot(float *d_m, float pivot, int N)
{
    // One thread per element; the guard skips threads past the end of the array.
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (j < N) d_m[j] /= pivot;
}
I want to read back the value in d_m from the device. Do I have to use cudaMemcpy to do that?
How can I then assign d_m = 1.0? Do I have to call cudaMemcpy again?
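To make the question concrete, here is a minimal sketch of the cudaMemcpy approach I am asking about. It assumes the element of interest is d_m[0]; readElement and writeElement are hypothetical helper names introduced only for illustration, the array size and values are made up, and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void DivideByPivot(float *d_m, float pivot, int N)
{
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (j < N) d_m[j] /= pivot;
}

// Hypothetical helper: copy a single element from device memory to the host.
float readElement(const float *d_m, int i)
{
    float value = 0.0f;
    cudaMemcpy(&value, d_m + i, sizeof(float), cudaMemcpyDeviceToHost);
    return value;
}

// Hypothetical helper: copy a single host value into device memory.
void writeElement(float *d_m, int i, float value)
{
    cudaMemcpy(d_m + i, &value, sizeof(float), cudaMemcpyHostToDevice);
}

int main()
{
    const int N = 8;  // assumed size, for illustration only
    float h_m[N];
    for (int j = 0; j < N; ++j) h_m[j] = 6.0f * (j + 1);

    float *d_m = nullptr;
    cudaMalloc(&d_m, N * sizeof(float));
    cudaMemcpy(d_m, h_m, N * sizeof(float), cudaMemcpyHostToDevice);

    DivideByPivot<<<1, N>>>(d_m, 2.0f, N);

    // A blocking cudaMemcpy on the default stream runs after the kernel
    // above has finished, so no explicit synchronization is added here.
    float first = readElement(d_m, 0);
    printf("d_m[0] after kernel: %f\n", first);

    writeElement(d_m, 0, 1.0f);
    printf("d_m[0] after write:  %f\n", readElement(d_m, 0));

    cudaFree(d_m);
    return 0;
}
```

Each helper transfers only sizeof(float) bytes, so this is a single-element copy rather than a copy of the whole array.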
Is there a simpler way to read or write a single value in device memory?