CapJo
January 27, 2010, 2:03pm
1
I have a piece of code here that does not behave as I expect.
I expect all of the memory I copy back to the host to be initialized with the value I set, but some elements come back with other values.
Either my assumption is wrong, I did something wrong, or cudaMemset3D is broken.
System: Windows XP x64, CUDA 2.3.
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

int main(int argc, char** argv)
{
    cudaError_t status = cudaSuccess;

    /* volume extent: width is given in bytes, height and depth in elements */
    cudaExtent volume_dim = {0};
    volume_dim.width = 396 * sizeof(unsigned char);
    volume_dim.height = 5;
    volume_dim.depth = 162;

    /* allocate the pitched 3D volume on the device and fill it with 21 */
    cudaPitchedPtr d_volume = {0};
    status = cudaMalloc3D(&d_volume, volume_dim);
    if (status != cudaSuccess) { fprintf(stderr, "%s\n", cudaGetErrorString(status)); }
    status = cudaMemset3D(d_volume, 21, volume_dim);
    if (status != cudaSuccess) { fprintf(stderr, "%s\n", cudaGetErrorString(status)); }

    cudaPos volPos = {0};
    volPos.x = 0 * sizeof(unsigned char);
    volPos.y = 0 * sizeof(unsigned char);
    volPos.z = 0 * sizeof(unsigned char);

    /* allocate host memory to receive the initialized data from the device */
    size_t total = volume_dim.width * volume_dim.height * volume_dim.depth;
    char* buffer = (char*) malloc(total);
    cudaPitchedPtr hostBufferPitch3D = {0};
    hostBufferPitch3D.ptr = (void*)buffer;
    hostBufferPitch3D.pitch = volume_dim.width * sizeof(unsigned char); /* bytes per row in x direction (no padding on the host) */
    hostBufferPitch3D.xsize = volume_dim.width;  /* extent of data in x direction */
    hostBufferPitch3D.ysize = volume_dim.height; /* extent of data in y direction */

    /* cudaMemcpy3D device to host */
    cudaMemcpy3DParms volCopyParms = {0};
    volCopyParms.srcPos = volPos;
    volCopyParms.srcPtr = d_volume;
    volCopyParms.dstPos = volPos;
    volCopyParms.dstPtr = hostBufferPitch3D;
    volCopyParms.extent = volume_dim;
    volCopyParms.kind = cudaMemcpyDeviceToHost;
    status = cudaMemcpy3D(&volCopyParms);
    if (status != cudaSuccess) { fprintf(stderr, "%s\n", cudaGetErrorString(status)); }

    /* every byte should now hold the fill value */
    for (size_t i = 0; i < total; i++)
    {
        if (buffer[i] != 21)
        {
            printf("Array element %lu not initialized!\n", (unsigned long)i);
            exit(3);
        }
    }

    free(buffer);
    cudaFree(d_volume.ptr);
    return 0;
}
CapJo
January 28, 2010, 12:07am
2
Can someone test and confirm this behavior / bug?
fcs
January 28, 2010, 8:50am
3
CapJo
January 28, 2010, 10:08am
4
Thank you fcs for your hint.
However, the problem is not the 3D memory usage itself but the cudaMemset3D function. In my opinion it does not behave as it should, at least on Windows XP x64.
I used cudaMemset(d_volume.ptr, 0, pitch_x * dim_y * dim_z) instead of cudaMemset3D, and now everything works correctly.
Can someone test the code above?
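The workaround from this post can be sketched as follows. This is only a minimal sketch, not NVIDIA's fix: it assumes the volume was allocated with cudaMalloc3D, so the slices lie back to back and the allocation spans pitch * height * depth bytes. The dimensions and the fill value 21 are taken from the test program in the first post.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

int main(void)
{
    /* Same dimensions as the test program; width is in bytes. */
    cudaExtent extent = make_cudaExtent(396 * sizeof(unsigned char), 5, 162);

    cudaPitchedPtr d_volume;
    cudaError_t status = cudaMalloc3D(&d_volume, extent);
    if (status != cudaSuccess) {
        fprintf(stderr, "cudaMalloc3D: %s\n", cudaGetErrorString(status));
        return EXIT_FAILURE;
    }

    /* Each row is padded to d_volume.pitch bytes. Assuming the slices are
       contiguous, the whole allocation covers pitch * height * depth bytes.
       Filling all of it also overwrites the row padding, which is harmless
       because the padding carries no data. */
    size_t totalBytes = d_volume.pitch * extent.height * extent.depth;
    status = cudaMemset(d_volume.ptr, 21, totalBytes);
    if (status != cudaSuccess) {
        fprintf(stderr, "cudaMemset: %s\n", cudaGetErrorString(status));
        cudaFree(d_volume.ptr);
        return EXIT_FAILURE;
    }

    printf("filled %lu bytes (pitch = %lu)\n",
           (unsigned long)totalBytes, (unsigned long)d_volume.pitch);

    cudaFree(d_volume.ptr);
    return EXIT_SUCCESS;
}
```

Note that the linear fill only works for the whole volume. Setting a sub-region would still need a per-row approach.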
CapJo
January 28, 2010, 10:39am
5
I tried the same lines of code again, and now they do not show the same problem as yesterday …
Yesterday I got uninitialized elements and today they are gone … very strange.
[Update]
Yet again the same thing … uninitialized data … it seems to appear randomly.
The choice of 0 was bad; using something else, 21 for example, shows the error in every run.
I modified the code above accordingly.
CapJo
March 18, 2010, 12:07am
6
I got a response from NVIDIA: this bug will be fixed in the next release of CUDA, probably 3.0, but I’m not sure what they mean by "next release".
downey
April 12, 2010, 11:45pm
8
Can we get a response from NVIDIA on this? I don’t believe it is fixed in 3.0, but I could be wrong. I too am a victim of this bug.
CapJo
April 13, 2010, 8:17am
9
This bug is probably not fixed yet. I filed a bug report long ago in the registered developers zone. The latest response was
There has been no update on this bug yet, so I assume the fix is not available in CUDA 3.0.
If cudaMemset3D is the problem, you could consider writing that function as a kernel of your own. If it’s really biting, you should bite it back…
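A hand-written replacement along those lines might look like the sketch below. The kernel name fillVolume and the launch configuration are hypothetical, not part of the CUDA API. The sketch assumes a volume allocated with cudaMalloc3D (slices stored back to back) and a row count (height * depth) that fits into the grid's y dimension, which holds for the 5 x 162 volume from the test program.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

/* Hypothetical stand-in for cudaMemset3D: one thread per byte.
   blockIdx.y enumerates rows across all slices (row = z * height + y),
   which assumes the slices are stored back to back. */
__global__ void fillVolume(cudaPitchedPtr vol, unsigned char value,
                           size_t width, size_t rows)
{
    size_t x   = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
    size_t row = blockIdx.y;
    if (x < width && row < rows) {
        unsigned char *rowPtr = (unsigned char *)vol.ptr + row * vol.pitch;
        rowPtr[x] = value;
    }
}

int main(void)
{
    cudaExtent extent = make_cudaExtent(396, 5, 162); /* width in bytes */
    size_t rows  = extent.height * extent.depth;
    size_t total = extent.width * rows;

    cudaPitchedPtr d_volume;
    cudaError_t status = cudaMalloc3D(&d_volume, extent);
    if (status != cudaSuccess) {
        fprintf(stderr, "cudaMalloc3D: %s\n", cudaGetErrorString(status));
        return EXIT_FAILURE;
    }

    /* One block row per volume row, enough blocks in x to cover the width. */
    dim3 block(128);
    dim3 grid((unsigned int)((extent.width + block.x - 1) / block.x),
              (unsigned int)rows);
    fillVolume<<<grid, block>>>(d_volume, 21, extent.width, rows);
    status = cudaGetLastError();
    if (status != cudaSuccess) {
        fprintf(stderr, "kernel launch: %s\n", cudaGetErrorString(status));
        cudaFree(d_volume.ptr);
        return EXIT_FAILURE;
    }

    /* Copy back into a tightly packed host buffer and check every byte. */
    unsigned char *buffer = (unsigned char *)malloc(total);
    if (buffer == NULL) {
        cudaFree(d_volume.ptr);
        return EXIT_FAILURE;
    }
    cudaMemcpy3DParms parms = {0};
    parms.srcPtr = d_volume;
    parms.dstPtr = make_cudaPitchedPtr(buffer, extent.width,
                                       extent.width, extent.height);
    parms.extent = extent;
    parms.kind   = cudaMemcpyDeviceToHost;
    status = cudaMemcpy3D(&parms);
    if (status != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy3D: %s\n", cudaGetErrorString(status));
        free(buffer);
        cudaFree(d_volume.ptr);
        return EXIT_FAILURE;
    }

    size_t bad = 0;
    for (size_t i = 0; i < total; i++) {
        if (buffer[i] != 21) { bad++; }
    }
    printf("%lu of %lu bytes differ from the fill value\n",
           (unsigned long)bad, (unsigned long)total);

    free(buffer);
    cudaFree(d_volume.ptr);
    return bad == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

Unlike a linear cudaMemset over the whole allocation, this kernel leaves the row padding untouched, at the cost of one extra launch.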
downey
April 14, 2010, 4:13pm
11
The workaround is to use either cudaMemset or cudaMemset2D. I noticed that cudaMemset is quite a bit faster than cudaMemset2D.
The main issue with this bug isn’t that it would be hard to write a replacement, but that you should be able to rely on basic functions. I incorrectly assumed that memset would be very stable, so I spent most of my time looking elsewhere for the problem. Finding the cause took a while; once I knew it was a bug in cudaMemset3D, the fix took no more than 2 minutes.
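The cudaMemset2D variant of the workaround might look like this. It is a sketch under the assumption that the volume comes from cudaMalloc3D, so it can be treated as one 2D image with height * depth rows of the same pitch. The dimensions and fill value are borrowed from the test program in the first post.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

int main(void)
{
    cudaExtent extent = make_cudaExtent(396, 5, 162); /* width in bytes */

    cudaPitchedPtr d_volume;
    cudaError_t status = cudaMalloc3D(&d_volume, extent);
    if (status != cudaSuccess) {
        fprintf(stderr, "cudaMalloc3D: %s\n", cudaGetErrorString(status));
        return EXIT_FAILURE;
    }

    /* Treat the volume as a 2D image with height * depth rows. Only the
       first extent.width bytes of each row are set, so the padding up to
       d_volume.pitch stays untouched. */
    status = cudaMemset2D(d_volume.ptr, d_volume.pitch, 21,
                          extent.width, extent.height * extent.depth);
    if (status != cudaSuccess) {
        fprintf(stderr, "cudaMemset2D: %s\n", cudaGetErrorString(status));
    }

    cudaFree(d_volume.ptr);
    return status == cudaSuccess ? EXIT_SUCCESS : EXIT_FAILURE;
}
```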
Yes, truly a better workaround… And yes again, it must have been VERY difficult to locate the bug…