cudaMemcpyToArray is deprecated

Robert_Crovella · July 31, 2019, 9:16pm

Suppose we have a cudaArray allocated with cudaMallocArray. It is inherently a 2D allocation, consisting of a width and height (in elements):

__host__ cudaError_t cudaMallocArray ( cudaArray_t* array, const cudaChannelFormatDesc* desc, size_t width, size_t height = 0, unsigned int  flags = 0 )

Let’s suppose our width is w and our height is h. Now let’s suppose we want to transfer data to this array. In the general case, we may wish to transfer data to a specific region within this array, the size of which is called the extent. Let’s say that the origin of that region is (x,y), the width of that region is sw, and the height of that region is sh. Like this:

[Figure: a cudaArray of width w and height h, containing a green region of width sw and height sh whose origin is at (x,y)]

A proper cudaMemcpy2DToArray operation to this destination region in the cudaArray dst, from a 2D unpitched host memory allocation src, would look like this (the x offset, like the pitch and the transfer width, is passed in bytes; the y offset and the height are in rows):

cudaMemcpy2DToArray(dst, x*sizeof(src[0]), y, src, sw*sizeof(src[0]), sw*sizeof(src[0]), sh, cudaMemcpyHostToDevice);

Regarding “2D unpitched host memory allocation”, that is effectively a linear allocation of memory. 2D refers to the idea that it logically represents a 2D region, consisting of rows and columns. However, it is ordinarily a 1D allocation, where the “rows” are adjacent to each other. In addition, the assumption for the above case is that the source allocation effectively has a width of sw, i.e. the entire source allocation corresponds to a data set the size and shape of the green region above.
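As a rough sketch of the unpitched case, the helper below turns the element-valued quantities (w, h, x, y, sw, sh) into the arguments of the copy, and checks that the region fits in the array. The names Copy2DArgs and unpitchedArgs are hypothetical, introduced only for illustration. It is plain host C++ so it compiles without the CUDA toolkit; the actual runtime call is shown in a comment. It assumes, per the runtime API documentation, that the x offset is given in bytes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type (not part of the CUDA API): the size-related
// arguments of cudaMemcpy2DToArray, in the units the runtime expects.
struct Copy2DArgs {
    std::size_t wOffset;  // destination x offset, in bytes
    std::size_t hOffset;  // destination y offset, in rows
    std::size_t spitch;   // source pitch, in bytes
    std::size_t width;    // width of the transfer, in bytes
    std::size_t height;   // height of the transfer, in rows
};

// Arguments for copying a tightly packed (unpitched) sw x sh host block of
// elemSize-byte elements to the region with origin (x, y) of a w x h array.
// w, h, x, y, sw and sh are all in elements.
inline Copy2DArgs unpitchedArgs(std::size_t w, std::size_t h,
                                std::size_t x, std::size_t y,
                                std::size_t sw, std::size_t sh,
                                std::size_t elemSize) {
    assert(x + sw <= w);  // region must fit horizontally
    assert(y + sh <= h);  // region must fit vertically
    // Unpitched source: the row pitch equals the transfer width.
    return Copy2DArgs{x * elemSize, y, sw * elemSize, sw * elemSize, sh};
}

// With the CUDA runtime available, the values would be used roughly as:
//   Copy2DArgs a = unpitchedArgs(w, h, x, y, sw, sh, sizeof(src[0]));
//   cudaMemcpy2DToArray(dst, a.wOffset, a.hOffset, src,
//                       a.spitch, a.width, a.height, cudaMemcpyHostToDevice);
```

For example, a 16 x 10 block of floats placed at (8, 4) in a 64 x 32 array gives a byte offset of 32 and a pitch and width of 64 bytes each.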

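For the general case of a pitched source (host) allocation, the only differences would be the choice of the src pointer, which must point to the beginning of the source region, and the value of the host pitch parameter, which becomes the row pitch of the source allocation rather than the width of the transferred region.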
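To make the pitched case concrete, here is a host-only sketch. regionStart computes the src pointer for a sub-region of a larger host buffer, and gatherRows is a stand-in that mimics the row-by-row semantics of a 2D copy (it is not the CUDA call itself, and a real cudaArray has an opaque layout). Both function names, and the parameters sx and sy for the source origin, are assumptions introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Start of the sub-region with origin (sx, sy), in elements, inside a host
// buffer whose rows are pitchBytes apart.
inline const unsigned char* regionStart(const void* base, std::size_t pitchBytes,
                                        std::size_t sx, std::size_t sy,
                                        std::size_t elemSize) {
    return static_cast<const unsigned char*>(base) + sy * pitchBytes + sx * elemSize;
}

// Host-only stand-in for a 2D copy: `height` rows of `widthBytes` bytes each,
// with consecutive source rows `spitch` bytes apart. The result is tightly packed.
inline std::vector<unsigned char> gatherRows(const void* src, std::size_t spitch,
                                             std::size_t widthBytes,
                                             std::size_t height) {
    assert(widthBytes <= spitch);  // a row cannot be wider than the pitch
    std::vector<unsigned char> out(widthBytes * height);
    for (std::size_t r = 0; r < height; ++r) {
        std::memcpy(out.data() + r * widthBytes,
                    static_cast<const unsigned char*>(src) + r * spitch,
                    widthBytes);
    }
    return out;
}

// With the CUDA runtime, the same pointer and pitch would be passed roughly as:
//   cudaMemcpy2DToArray(dst, x * sizeof(float), y,
//                       regionStart(host, hostPitchBytes, sx, sy, sizeof(float)),
//                       hostPitchBytes, sw * sizeof(float), sh,
//                       cudaMemcpyHostToDevice);
```

The point of the sketch is that the pitch describes the full source row, while the width describes only the part of each row that is transferred.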

From the above description, it is evident that a pitch is involved when using this api as a replacement for cudaMemcpyToArray (which is deprecated).

Going back to the original question then, when refactoring code that currently looks like this:

cudaMemcpyToArray(dst, 0, 0, &vector[0], nSamples*sizeof(float), cudaMemcpyHostToDevice)

it will be necessary to know the width associated with the cudaArray dst. The source line pitch parameter (as well as transfer column width) associated with the cudaMemcpy2DToArray operation must be consistent with (i.e. less than or equal to) the width of the cudaArray (we are considering both widths in elements for this comparison statement, although the widths associated with the cudaMemcpy2DToArray operation are expressed in bytes).
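One possible way to derive the 2D parameters for such a refactoring is sketched below. RowCopy and flatToRows are hypothetical names, and arrayWidth stands for the width of dst in elements. The sketch assumes the copy starts at offset (0, 0) and that data longer than one row is meant to fill whole rows of the array; if the original code relied on some other layout, the mapping would need to change.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: pitch, width and height for the 2D call.
struct RowCopy {
    std::size_t spitch;  // source pitch, in bytes
    std::size_t width;   // transfer width, in bytes
    std::size_t height;  // number of rows
};

// Map a flat copy of nSamples elements onto rows of a cudaArray that is
// arrayWidth elements wide. Assumption: data longer than one row fills whole
// rows, so nSamples must then be a multiple of arrayWidth.
inline RowCopy flatToRows(std::size_t nSamples, std::size_t arrayWidth,
                          std::size_t elemSize) {
    assert(nSamples > 0);
    assert(arrayWidth > 0);
    if (nSamples <= arrayWidth) {
        // Everything fits in a single row.
        return RowCopy{nSamples * elemSize, nSamples * elemSize, 1};
    }
    assert(nSamples % arrayWidth == 0);
    return RowCopy{arrayWidth * elemSize, arrayWidth * elemSize,
                   nSamples / arrayWidth};
}

// The deprecated call
//   cudaMemcpyToArray(dst, 0, 0, &vector[0], nSamples*sizeof(float), cudaMemcpyHostToDevice)
// would then become roughly:
//   RowCopy c = flatToRows(nSamples, arrayWidth, sizeof(float));
//   cudaMemcpy2DToArray(dst, 0, 0, &vector[0], c.spitch, c.width, c.height,
//                       cudaMemcpyHostToDevice);
```

In both branches the transfer width never exceeds the array width, which is the consistency requirement described above.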

If the width of the cudaArray is not known by inspection of the code, it can be ascertained in-situ at runtime using the cudaArrayGetInfo api:

https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEMORY.html#group__CUDART__MEMORY_1g373dacf191566b0bf5e5b807517b6bf9
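A small sketch of how the queried values might be turned into a byte width follows. cudaArrayGetInfo reports a cudaChannelFormatDesc (per-channel bit counts x, y, z, w) and a cudaExtent (width in elements for array memory). The helpers elementBytes and maxRowBytes are hypothetical names; the code is plain host C++, with the runtime call itself shown only in a comment.

```cpp
#include <cassert>
#include <cstddef>

// Bytes per element, from the per-channel bit counts reported in
// cudaChannelFormatDesc (fields x, y, z, w).
inline std::size_t elementBytes(int x, int y, int z, int w) {
    int bits = x + y + z + w;
    assert(bits > 0);
    assert(bits % 8 == 0);
    return static_cast<std::size_t>(bits) / 8;
}

// Largest transfer width, in bytes, that fits in one row of an array that is
// widthInElements wide.
inline std::size_t maxRowBytes(std::size_t widthInElements, std::size_t elemBytes) {
    return widthInElements * elemBytes;
}

// With the CUDA runtime, the inputs would come from something like:
//   cudaChannelFormatDesc desc; cudaExtent extent; unsigned int flags;
//   cudaArrayGetInfo(&desc, &extent, &flags, dst);
//   std::size_t eb  = elementBytes(desc.x, desc.y, desc.z, desc.w);
//   std::size_t row = maxRowBytes(extent.width, eb);
```

The width passed to cudaMemcpy2DToArray (in bytes) can then be compared against the row value before issuing the copy.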
