I have been banging my head against the wall trying to copy 2D matrices to the GPU and back to the host. I am trying to use cudaMallocPitch(), but I can’t even get a call to it to compile.
[codebox]///// Inside Kernel.cu /////
int main()
{
    float* devPtr;
    int pitch;
    cudaMallocPitch((void**)&devPtr, &pitch, 3 * sizeof(float), 3);
    return 0;
}
}[/codebox]
This code will not compile, and it is the most stripped-down usage of cudaMallocPitch() I have ever seen. When I try to compile, I get the error message “error: no instance of overloaded function “cudaMallocPitch” matches the argument list”, which means the arguments I’m passing don’t match the parameter types that cudaMallocPitch() expects; however, the code I just posted is identical to what you see in the Programming Guide 3.0, page 20.
What gives?
EDIT: And when I changed “int pitch” to “size_t pitch”, it worked fine. I hate not being told the right thing to do. Oh well!
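For anyone who lands here with the same error: the pitch argument of cudaMallocPitch() is declared as size_t*, and C++ will not implicitly convert an int* to a size_t*, so the compiler finds no matching overload. Below is a minimal sketch of the round trip I was originally after (3x3 matrix to the GPU and back), assuming a working CUDA toolkit and device. The helper name roundTripMatrix and the hostIn/hostOut arrays are my own illustration, not something from the Programming Guide.
[codebox]///// Inside Kernel.cu /////
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: copies a 3x3 matrix into pitched device memory and
// back again, returning true if the data comes back unchanged.
bool roundTripMatrix()
{
    const int rows = 3;
    const int cols = 3;
    const size_t rowBytes = cols * sizeof(float);

    float hostIn[rows][cols] = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
    float hostOut[rows][cols] = {};

    float* devPtr = NULL;
    size_t pitch = 0; // size_t, not int: the parameter type is size_t*

    if (cudaMallocPitch((void**)&devPtr, &pitch, rowBytes, rows) != cudaSuccess)
        return false;

    // Host -> device: host rows are rowBytes apart, device rows are pitch bytes apart.
    cudaError_t err = cudaMemcpy2D(devPtr, pitch, hostIn, rowBytes,
                                   rowBytes, rows, cudaMemcpyHostToDevice);

    // Device -> host: same copy with source and destination pitches swapped.
    if (err == cudaSuccess)
        err = cudaMemcpy2D(hostOut, rowBytes, devPtr, pitch,
                           rowBytes, rows, cudaMemcpyDeviceToHost);

    cudaFree(devPtr);
    if (err != cudaSuccess)
        return false;

    // Check that every element survived the round trip.
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            if (hostOut[r][c] != hostIn[r][c])
                return false;
    return true;
}

int main()
{
    bool ok = roundTripMatrix();
    printf("%s\n", ok ? "round trip OK" : "round trip FAILED");
    return ok ? 0 : 1;
}[/codebox]
The pitch that comes back is the device row stride in bytes and may be larger than 3 * sizeof(float), which is why cudaMemcpy2D() takes separate pitches for the source and the destination.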