How to use cudaMemcpyToArray()? "invalid argument" runtime error


I don’t know what I am doing wrong with the cudaMemcpyToArray() function. See the simple code below:

[codebox]#include <stdio.h>
#include <stdlib.h>

cudaArray *d_u0array;

int main()
{
    int nx = 128, ny = 128, nz = 4;
    int nxyz = nx * ny * nz;
    cudaError_t err;

    /* host buffer holding the whole nx * ny * nz volume */
    float *h_u0 = (float *) malloc(nxyz * sizeof(float));

    cudaExtent extent = make_cudaExtent(nx, ny, nz);
    cudaChannelFormatDesc channelDesc_f = cudaCreateChannelDesc<float>();

    /* allocate the 3D cudaArray */
    err = cudaMalloc3DArray(&d_u0array, &channelDesc_f, extent);
    if (err != cudaSuccess)
        printf("error 1: %s\n", cudaGetErrorString(err));

    /* this is the call that returns "invalid argument" */
    err = cudaMemcpyToArray(d_u0array, 0, 0, h_u0, nxyz * sizeof(float), cudaMemcpyHostToDevice);
    if (err != cudaSuccess)
        printf("error 2: %s\n", cudaGetErrorString(err));

    return 0;
}[/codebox]

The code compiles fine, but at runtime an “invalid argument” error is returned.

Can cudaMemcpyToArray() actually be used for copying 3D arrays?

More clues:

If nz is 1, the runtime error goes away. As soon as nz is greater than 1, the error appears. So does cudaMemcpyToArray() not work for a 3D cudaArray? Is there some call that does?
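
For reference, the runtime API does have a dedicated 3D copy call, cudaMemcpy3D(), and the following is a minimal sketch of how the copy above might be written with it. It assumes the host buffer is tightly packed (row pitch of nx * sizeof(float)), and the name copyParams and the use of calloc() are my own choices for illustration; I have not confirmed that this is the only way to fill a 3D cudaArray.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

int main()
{
    int nx = 128, ny = 128, nz = 4;
    size_t nxyz = (size_t)nx * ny * nz;
    cudaArray *d_u0array = NULL;
    cudaError_t err;

    /* zero-initialized, tightly packed host volume (assumption: no row padding) */
    float *h_u0 = (float *) calloc(nxyz, sizeof(float));
    if (h_u0 == NULL)
        return 1;

    /* for a cudaArray the extent is given in elements, not bytes */
    cudaExtent extent = make_cudaExtent(nx, ny, nz);
    cudaChannelFormatDesc channelDesc_f = cudaCreateChannelDesc<float>();

    err = cudaMalloc3DArray(&d_u0array, &channelDesc_f, extent);
    if (err != cudaSuccess) {
        printf("error 1: %s\n", cudaGetErrorString(err));
        free(h_u0);
        return 1;
    }

    /* describe the copy: pitched host pointer as source, cudaArray as destination */
    cudaMemcpy3DParms copyParams = {0};
    copyParams.srcPtr   = make_cudaPitchedPtr((void *)h_u0,
                                              nx * sizeof(float), /* row pitch in bytes */
                                              nx,                 /* row width in elements */
                                              ny);                /* rows per slice */
    copyParams.dstArray = d_u0array;
    copyParams.extent   = extent;
    copyParams.kind     = cudaMemcpyHostToDevice;

    err = cudaMemcpy3D(&copyParams);
    if (err != cudaSuccess)
        printf("error 2: %s\n", cudaGetErrorString(err));

    cudaFreeArray(d_u0array);
    free(h_u0);
    return (err == cudaSuccess) ? 0 : 1;
}
```

The one detail worth double-checking here is the extent: when a cudaArray takes part in the copy, the width is counted in elements, whereas the pitch passed to make_cudaPitchedPtr() is in bytes.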