I declared a global device variable like this:
__device__ float m_dev_minimum_global;
I initialize it like this:
float m_minimum_global = MAX_FLOAT;
cudaMemcpyToSymbol(m_dev_minimum_global, &m_minimum_global, sizeof(float));
I then run a kernel, which makes use of the variable and writes new values to it.
Now I just want to copy the value inside the variable to a host variable from host code.
This is how I last tried to use it:
float *host_distance_gpu;
host_distance_gpu = (float*)malloc(sizeof(float));
cudaMemcpyFromSymbol(host_distance_gpu, m_dev_minimum_global, sizeof(float));
I also tried using a plain (non-pointer) float and passing its address, but the host variable always ends up as 0.0000.
I know the device variable holds the correct value, because as a workaround I currently use a kernel that copies it into an ordinary device pointer, which I then read back with cudaMemcpy.
Can someone explain to me why this doesn’t work?
Also, when debugging, I cannot step into the cudaMemcpyFromSymbol call, but it returns cudaSuccess.
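For reference, the round trip described above can be condensed into a minimal, self-contained sketch. This is not my actual code: the kernel `update_minimum`, the helpers `ok` and `read_back_minimum`, and the value 1.5f are hypothetical stand-ins, and `FLT_MAX` from `<cfloat>` is assumed as a substitute for my `MAX_FLOAT`. It checks the result of every runtime call, including the kernel launch, which should make it easier to see at which step the value gets lost.

```cuda
#include <cfloat>
#include <cstdio>
#include <cuda_runtime.h>

// Same declaration as in the question.
__device__ float m_dev_minimum_global;

// Hypothetical stand-in for the real kernel: a single thread lowers the
// global minimum if the candidate value is smaller.
__global__ void update_minimum(float candidate)
{
    if (candidate < m_dev_minimum_global) {
        m_dev_minimum_global = candidate;
    }
}

// Hypothetical helper: reports a failed runtime call and returns false.
static bool ok(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

// Initializes the symbol, runs the kernel, and copies the symbol back to *result.
// Returns true only if every step reported cudaSuccess.
bool read_back_minimum(float *result)
{
    float init = FLT_MAX;  // assumption: stands in for MAX_FLOAT
    if (!ok(cudaMemcpyToSymbol(m_dev_minimum_global, &init, sizeof(float)),
            "cudaMemcpyToSymbol")) return false;

    update_minimum<<<1, 1>>>(1.5f);
    if (!ok(cudaGetLastError(), "kernel launch")) return false;
    if (!ok(cudaDeviceSynchronize(), "kernel execution")) return false;

    // The symbol is passed directly, not as a string and not via its address.
    return ok(cudaMemcpyFromSymbol(result, m_dev_minimum_global, sizeof(float)),
              "cudaMemcpyFromSymbol");
}

int main()
{
    float host_distance_gpu = 0.0f;  // plain float; its address is the destination
    if (!read_back_minimum(&host_distance_gpu)) return 1;
    std::printf("minimum read back from device: %f\n", host_distance_gpu);
    return 0;
}
```

The sketch keeps the `__device__` variable, the kernel, and both symbol copies in a single .cu file, so all of them refer to the same symbol.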