ojaswa
March 28, 2009, 9:02am
1
In a device to host memory transfer (from a device cudaPitchedPtr ‘d_pptr’ to host mem ‘h_mem’), one needs to make a pitchedPtr out of h_mem. If the following is used to do this, is the resulting pitchedPtr compatible with widths that are not a multiple of 8?
copyParams.dstPtr = make_cudaPitchedPtr((void*)h_mem,width*sizeof(float), width, height);
My observation is that dstPtr.pitch = width*sizeof(float), while d_pptr.pitch = 8*ceil(width/8)*sizeof(float).
Further, dstPtr.xsize = width, while d_pptr.xsize = width*sizeof(float).
Are these two pitchedPtrs compatible, so that a call to cudaMemcpy3D() succeeds?
The question arises since I get some memcopy problems for volume dimensions that are non-powers of 2.
Thanks,
Oj
ojaswa
March 28, 2009, 11:30am
2
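The answer is yes! cudaMemcpy3D() handles the difference between the two: each cudaPitchedPtr carries its own pitch, so the source and destination pitches do not have to match.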
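To illustrate why differing pitches are not a problem, here is a minimal host-only C sketch of a row-by-row pitched copy. It is not CUDA's implementation and makes no CUDA calls. `PitchedPtr`, `copy3d_pitched` and `pitched_copy_roundtrip_ok` are hypothetical names invented for this example, and the "round up to 8 floats" padding is only an assumption taken from the observation in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for cudaPitchedPtr; plain host memory only. */
typedef struct {
    void  *ptr;    /* base address */
    size_t pitch;  /* bytes from the start of one row to the next, padding included */
    size_t xsize;  /* logical row width (not used by the copy below) */
    size_t ysize;  /* rows per slice */
} PitchedPtr;

/* Copy row_bytes x height x depth, stepping each side by its OWN pitch. */
void copy3d_pitched(PitchedPtr dst, PitchedPtr src,
                    size_t row_bytes, size_t height, size_t depth)
{
    assert(row_bytes <= src.pitch && row_bytes <= dst.pitch);
    for (size_t z = 0; z < depth; ++z) {
        for (size_t y = 0; y < height; ++y) {
            const char *s = (const char *)src.ptr + (z * src.ysize + y) * src.pitch;
            char       *d = (char *)dst.ptr       + (z * dst.ysize + y) * dst.pitch;
            memcpy(d, s, row_bytes);  /* only the payload, never the padding */
        }
    }
}

/* Returns 1 if a padded source copies correctly into a packed destination. */
int pitched_copy_roundtrip_ok(size_t width, size_t height, size_t depth)
{
    size_t row_bytes = width * sizeof(float);
    /* Assumed padding rule for this demo: rows rounded up to 8 floats. */
    size_t src_pitch = ((width + 7) / 8) * 8 * sizeof(float);
    size_t dst_pitch = row_bytes;  /* packed buffer, as h_mem in the question */

    char  *src_mem = malloc(src_pitch * height * depth);
    float *dst_mem = malloc(dst_pitch * height * depth);
    if (!src_mem || !dst_mem) { free(src_mem); free(dst_mem); return 0; }

    /* Fill everything with junk, then write known values into the payload. */
    memset(src_mem, 0xFF, src_pitch * height * depth);
    for (size_t z = 0; z < depth; ++z)
        for (size_t y = 0; y < height; ++y) {
            float *row = (float *)(src_mem + (z * height + y) * src_pitch);
            for (size_t x = 0; x < width; ++x)
                row[x] = (float)((z * height + y) * width + x);
        }

    PitchedPtr src = { src_mem, src_pitch, row_bytes, height };
    PitchedPtr dst = { dst_mem, dst_pitch, width,     height };
    copy3d_pitched(dst, src, row_bytes, height, depth);

    /* The packed destination should now hold 0, 1, 2, ... with no gaps. */
    int ok = 1;
    for (size_t i = 0; i < width * height * depth; ++i)
        if (dst_mem[i] != (float)i) { ok = 0; break; }

    free(src_mem);
    free(dst_mem);
    return ok;
}
```

Calling `pitched_copy_roundtrip_ok(5, 3, 2)` exercises a width that is not a multiple of 8: the source rows are 32 bytes apart, the destination rows 20 bytes apart, and only the 20 payload bytes of each row are copied.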
-Oj