I am implementing a Lua wrapper for a kind of vector object that resides entirely in device memory, i.e. you can apply a sequence of kernels to this vector without having to copy the data back and forth between device and host.
The memory is allocated like this:

    // tensor->storage is a float*
    cutilSafeCall(cudaMalloc((void**)&(tensor->storage), totalSize));
Then I can assign a single value to an element of the vector:

    float value = 1.2f;
    long index = 0;
    // Copy one float from the host into element `index` of the device buffer.
    cutilSafeCall(cudaMemcpy(tensor->storage + index, &value, sizeof(float), cudaMemcpyHostToDevice));
and I retrieve the value and print it like this:

    float value = 0;
    long index = 0;
    // Copy element `index` of the device buffer back to the host.
    cutilSafeCall(cudaMemcpy(&value, tensor->storage + index, sizeof(float), cudaMemcpyDeviceToHost));
    printf("value = %f\n", value);
But when I run this, the retrieved value is printed as:
value = 1.2000000476837
However, if I assign an integer value, like 1, the retrieved value is correct. Is this some kind of weird rounding problem, or am I doing something wrong? I am using CUDA 3.0 beta 1 on Windows 7 x64, with Visual Studio 2008 and a GTX 275 GPU.
Thanks in advance!