I’m completely new to CUDA and I’m trying to learn it at the moment, but I’m having some difficulties and I hope someone out there can point me in the right direction.
I’m trying to run the following code:
[codebox]unsigned char d_pixelDate = NULL;
unsigned char h_pixelDate[100][100];
//Allocate required memory on device
size_t *pitch = NULL;
cudaMallocPitch((void **)d_pixelDate, pitch, 100, 100);
//copy pixel data to the device
cudaMemcpy2D((void *)d_pixelDate, *pitch, h_pixelDate, 100, 100, 100, cudaMemcpyHostToDevice);[/codebox]
On the cudaMallocPitch line, I’m getting an error which, translated from German to English, should be something like: “unauthorised access while writing to address 0x00000000”.
I’d be very grateful if anybody can point me to why I’m getting this error in this piece of code. Thanks.
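For what it’s worth, a write to address 0x00000000 is what one would expect when a function is handed a null pointer as an output parameter. cudaMallocPitch returns both the device pointer and the pitch through pointer arguments, whereas the snippet above passes `pitch` while it is still NULL and casts the value of `d_pixelDate` rather than taking its address. The sketch below is host-only and uses a hypothetical stand-in, `fakeMallocPitch` (not a CUDA function; the 512-byte row alignment is an assumption made purely for illustration), to show the calling pattern:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical host-only stand-in for an allocator that, like cudaMallocPitch,
// reports its results through output pointers. This is NOT a CUDA function,
// and the 512-byte alignment is an assumption for illustration only.
int fakeMallocPitch(void **devPtr, std::size_t *pitch,
                    std::size_t widthInBytes, std::size_t height)
{
    // A real API would typically dereference these without checking;
    // passing NULL for either one then means a write to address 0.
    if (devPtr == nullptr || pitch == nullptr)
        return 1;

    const std::size_t alignment = 512;
    // Round the row width up to the next multiple of the alignment.
    *pitch  = (widthInBytes + alignment - 1) / alignment * alignment;
    *devPtr = std::malloc(*pitch * height);
    return (*devPtr != nullptr) ? 0 : 2;
}

// Calling pattern: declare real variables and pass their ADDRESSES.
std::size_t allocateAndReportPitch(std::size_t widthInBytes, std::size_t height)
{
    void *raw = nullptr;      // receives the allocation
    std::size_t pitch = 0;    // a size_t object, not a null size_t*

    const int status = fakeMallocPitch(&raw, &pitch, widthInBytes, height);
    assert(status == 0);
    (void)status;

    unsigned char *pixels = static_cast<unsigned char *>(raw);
    pixels[0] = 0;            // the buffer is usable at this point
    std::free(pixels);
    return pitch;
}
```

With the real API the same idea would apply: a `size_t pitch` variable passed as `&pitch`, and a pointer variable passed by address.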
Now, after modifying the code a little, I’m facing another problem I can’t understand. Here is the code (please note the enlarged characters colored red; that’s the interesting part):
[codebox]unsigned char **device_pixelData = NULL;
unsigned char **host_pixelData = NULL;
unsigned char *buffer;
int height = 10;
int width = 5;
host_pixelData = new unsigned char *[height];
for (int a=0;a<height; a++)
host_pixelData[a]=new unsigned char [width];
buffer= new unsigned char[width];
for(int i = 0;i<width;i++) buffer[i] = i;
for (int row=(height-1);row>=0;row--)
for (int line=0;line<width;line++)
host_pixelData[row][line]= buffer[line];
cudaError_t errorMessage1, errorMessage2;
//Allocate required memory on device
size_t pitch = 0;
size_t spitch = width*sizeof(unsigned char);
errorMessage1 = cudaMallocPitch((void **)&device_pixelData, &pitch, spitch, height);
//copy pixel data to the device
errorMessage2 = cudaMemcpy2D(device_pixelData, pitch, host_pixelData, spitch, width, height, cudaMemcpyHostToDevice);[/codebox]
I noticed that the success or failure of the code depends on the values of height and width, and I can’t understand why. Below are some of the height and width values I tried, with the resulting error messages.
------> errorMessage1 = cudaSuccess
trying to execute the line with errorMessage2 leads to an access violation while trying to read from position 0x00361000
------> errorMessage1 = cudaSuccess
trying to execute the line with errorMessage2 leads to an access violation while trying to read from position 0x00361010
I can’t understand why certain configurations of height and width lead to access violations. Can anybody help point me to the source of the problem? Thanks.
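You can’t use cudaMemcpy2D() to copy the kind of array of pointers you have on the host side to the device. cudaMemcpy2D() expects the source to be one contiguous block in which consecutive rows start spitch bytes apart, and host_pixelData is not such a block. “Pitched” memory on the device is just linear, 1D memory that has been padded and aligned for optimal performance on the device.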
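To make the “linear, padded” point concrete, here is a host-only sketch. It uses `std::vector` in place of device memory, and the helper names (`pitchedOffset`, `copy2D`, `roundTrip`) are hypothetical. `copy2D` only imitates the row-by-row behaviour of a pitched 2D copy; it is not the CUDA implementation. Any pitch value passed in is an assumption, since a real pitch is whatever cudaMallocPitch reports.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Byte offset of element (row, col) in a pitched buffer of 1-byte elements:
// each row starts `pitch` bytes after the previous one.
inline std::size_t pitchedOffset(std::size_t row, std::size_t col, std::size_t pitch)
{
    return row * pitch + col;
}

// Host-only imitation of a pitched 2D copy. Both source and destination are
// addressed from ONE base pointer with a fixed stride per row.
void copy2D(unsigned char *dst, std::size_t dpitch,
            const unsigned char *src, std::size_t spitch,
            std::size_t widthBytes, std::size_t height)
{
    assert(widthBytes <= dpitch && widthBytes <= spitch);
    for (std::size_t row = 0; row < height; ++row)
        std::memcpy(dst + row * dpitch, src + row * spitch, widthBytes);
}

// Builds a tightly packed height x width image where each element holds its
// column index, copies it into a padded buffer, and reads one element back.
unsigned char roundTrip(std::size_t width, std::size_t height, std::size_t pitch,
                        std::size_t row, std::size_t col)
{
    std::vector<unsigned char> host(width * height);
    for (std::size_t r = 0; r < height; ++r)
        for (std::size_t c = 0; c < width; ++c)
            host[r * width + c] = static_cast<unsigned char>(c);

    std::vector<unsigned char> padded(pitch * height, 0xFF);  // 0xFF marks padding
    copy2D(padded.data(), pitch, host.data(), width, width, height);
    return padded[pitchedOffset(row, col, pitch)];
}
```

The source here is a single `width * height` byte allocation, which is the layout a fixed-stride copy can walk. An array of separately allocated rows offers no single base pointer from which every row can be reached that way.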
Thanks for your reply. I still don’t understand why I can’t do that; could you be a little more explicit as to why not? I was under the impression that the pitch parameter would take care of the alignment issues. And why does it work with some configurations of height and width but not with others? Also, what kind of arrays can I then copy from host to device using cudaMemcpy2D()?
The basic problem is that your storage host_pixelData is a one dimensional array of pointers. This is not the same as a statically declared two dimensional array, nor is it the same as a one dimensional statically declared or dynamically allocated array of the same size. Consider the following code:
You might imagine that One, Two and Three are functionally equivalent to one another, but they are not. One and Two are interchangeable. Three is not. I suggest you study the above code until you understand why it doesn’t work as you might expect it to.
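As a hypothetical illustration of that distinction (not necessarily the listing referred to above), the sketch below builds three layouts: a static 2D array `one`, a flat 1D array `two` of the same size, and an array of separately allocated rows `three`. It then checks whether each row starts exactly `kWidth` bytes after the previous one. All names and the 10 x 5 size are assumptions chosen to match the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

constexpr std::size_t kHeight = 10;
constexpr std::size_t kWidth  = 5;

// True if row r starts exactly r * kWidth bytes after row 0, i.e. the rows
// form one contiguous block that a fixed-stride copy can walk.
bool rowsAreBackToBack(const std::vector<const unsigned char *> &rowStarts)
{
    const auto base = reinterpret_cast<std::uintptr_t>(rowStarts[0]);
    for (std::size_t r = 1; r < rowStarts.size(); ++r)
        if (reinterpret_cast<std::uintptr_t>(rowStarts[r]) != base + r * kWidth)
            return false;
    return true;
}

// "One": statically declared 2D array, stored as one block.
bool staticArrayIsContiguous()
{
    static unsigned char one[kHeight][kWidth];
    std::vector<const unsigned char *> rows;
    for (std::size_t r = 0; r < kHeight; ++r)
        rows.push_back(&one[r][0]);
    return rowsAreBackToBack(rows);
}

// "Two": flat 1D array indexed as two[r * kWidth + c].
bool flatArrayIsContiguous()
{
    std::vector<unsigned char> two(kHeight * kWidth);
    std::vector<const unsigned char *> rows;
    for (std::size_t r = 0; r < kHeight; ++r)
        rows.push_back(two.data() + r * kWidth);
    return rowsAreBackToBack(rows);
}

// "Three": array of pointers, each row allocated separately on the heap.
bool pointerArrayIsContiguous()
{
    std::vector<std::unique_ptr<unsigned char[]>> three;
    for (std::size_t r = 0; r < kHeight; ++r)
        three.push_back(std::unique_ptr<unsigned char[]>(new unsigned char[kWidth]));
    std::vector<const unsigned char *> rows;
    for (const auto &row : three)
        rows.push_back(row.get());
    return rowsAreBackToBack(rows);
}
```

On typical heap implementations the third check fails, because every allocation is aligned and carries its own bookkeeping. It would also be consistent with the crashes in the question: passing `host_pixelData` as the source makes the copy read `width * height` bytes starting at the pointer table, which is only `height * sizeof(unsigned char *)` bytes long. Whether the read past that table faults presumably depends on what happens to lie next to it in memory.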