Why does cudaMemcpy2D cause an "invalid pitch argument" error?


I’ve run into a strange error. I use the “cudaMemcpy2D” function as follows:

cudaMemcpy2D(A, pA, B, pB, width_in_bytes, height, cudaMemcpyHostToDevice);

Since B is a host float*, I have pB = width_in_bytes = N*sizeof(float).

When I use an array (B) of width N=65536, there is no problem.

But when I use N=65537 (in fact, any N > 65536), I get an “invalid pitch argument” error.

In that case, I have pA = 131072 and pB = 262148 (65537*sizeof(float)).

The memory allocation doesn’t cause an error.

Is there a problem with a pitch of more than 2^16 floats?

Can I copy a 2D section wider than 2^16 floats?

Or more simply, what is the problem?

Many thanks,


It looks like you are trying to copy two variables (A and B) to the GPU in one go?

I now know exactly what the problem is. When you call cudaMemcpy2D, you must specify the pitch of the source and the pitch of the destination. Both pitches must be at most 262144 bytes. This means that with an array of float (4 bytes each), a row cannot be wider than 65536 elements.
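To make the arithmetic concrete, here is a minimal sketch in plain C (no GPU needed). The helper names `packed_pitch` and `pitch_is_allowed` are hypothetical, and the limit is passed in as a parameter. The value 262144 is only what this thread suggests, since N=65536 (pitch 262144) works and N=65537 (pitch 262148) fails. As far as I know, the CUDA runtime reports a per-device maximum pitch in the `memPitch` field of `cudaDeviceProp`, so on a real device that value would probably be the one to pass in.

```c
#include <assert.h>
#include <stddef.h>

/* Pitch in bytes of a tightly packed row of `n` elements
   (hypothetical helper, not part of the CUDA API). */
size_t packed_pitch(size_t n, size_t elem_size)
{
    return n * elem_size;
}

/* Returns 1 if `pitch` is within `max_pitch`, 0 otherwise.
   `max_pitch` is an assumption supplied by the caller; 262144 is the
   value implied by the observations in this thread. */
int pitch_is_allowed(size_t pitch, size_t max_pitch)
{
    assert(max_pitch > 0);
    return pitch <= max_pitch;
}
```

With a 4-byte float and a limit of 262144, a row of 65536 floats passes the check and a row of 65537 floats does not, which matches the behaviour described in the question.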

Of course, you can first copy part of your array into a smaller array on the CPU and then use cudaMemcpy2D from that smaller array. But I think this is a strong limitation for GPGPU applications such as mine. I hope this limit will increase with future GPUs.
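The host-side staging step could look roughly like the sketch below, in plain C. The names `max_tile_cols` and `pack_column_tile` are hypothetical, and the source array is assumed to be row-major with `src_cols` floats per row. It only packs one column tile into a tightly packed buffer; the actual transfer is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Largest number of elements per tile row such that a packed tile row
   still fits in `max_pitch` bytes (hypothetical helper). */
size_t max_tile_cols(size_t max_pitch, size_t elem_size)
{
    assert(elem_size > 0);
    return max_pitch / elem_size;
}

/* Copy columns [col0, col0 + tile_cols) of every row of `src` into
   `staging`, which is tightly packed: tile_cols floats per row. */
void pack_column_tile(float *staging, const float *src,
                      size_t src_cols, size_t rows,
                      size_t col0, size_t tile_cols)
{
    assert(col0 + tile_cols <= src_cols);
    for (size_t r = 0; r < rows; ++r) {
        memcpy(staging + r * tile_cols,
               src + r * src_cols + col0,
               tile_cols * sizeof(float));
    }
}
```

Each packed tile then has a source pitch of tile_cols*sizeof(float), which can be handed to cudaMemcpy2D. The destination pitch would presumably have to respect the same limit, for example by giving each tile its own device allocation.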