Dereferencing a 2D host array + memcpy to device

Hi there,

I am trying something simple like:

int array_h[5][5];

for (int i = 0; i < 5; i++) {
    cudaMemcpy(array_d, array_h[i], 5 * sizeof(int), cudaMemcpyHostToDevice);
}

The device array is a simple 1D array. How do I dereference array_h correctly in cudaMemcpy?


You have to allocate array_d on the GPU with cudaMalloc. Also, your loop just overwrites array_d: every iteration copies the current row to the same device address.

I’ve noted this a few times, but most of the time it is more efficient to flatten any 2D or 3D arrays to 1D; see for example:
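
A minimal sketch of what flattening could look like for the 5x5 case in this thread. The names ROWS, COLS, flat_index, scale_flat and flatten_and_scale are made up for illustration and are not from the original posts. The idea is that one contiguous buffer needs only one cudaMalloc and one cudaMemcpy, and element (row, col) sits at offset row * COLS + col.

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr int ROWS = 5;  // hypothetical sizes, matching the 5x5 example
constexpr int COLS = 5;

// Row-major mapping from (row, col) to an offset in the flat buffer.
__host__ __device__ inline int flat_index(int row, int col) {
    return row * COLS + col;
}

// Hypothetical kernel: one block per row, one thread per column.
__global__ void scale_flat(int *data, int factor) {
    int row = blockIdx.x;
    int col = threadIdx.x;
    if (row < ROWS && col < COLS) {
        data[flat_index(row, col)] *= factor;
    }
}

// Fills a flat host array with 10*row + col, scales it on the device,
// and returns the result. Returns an empty vector if allocation fails.
std::vector<int> flatten_and_scale(int factor) {
    std::vector<int> h(ROWS * COLS);
    for (int r = 0; r < ROWS; ++r)
        for (int c = 0; c < COLS; ++c)
            h[flat_index(r, c)] = 10 * r + c;

    const size_t bytes = h.size() * sizeof(int);
    int *d = nullptr;
    if (cudaMalloc((void **)&d, bytes) != cudaSuccess) return {};

    // A single copy moves the whole "2D" array in each direction.
    cudaMemcpy(d, h.data(), bytes, cudaMemcpyHostToDevice);
    scale_flat<<<ROWS, COLS>>>(d, factor);
    cudaMemcpy(h.data(), d, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d);
    return h;
}
```

Note that a statically declared int array_h[5][5] is already contiguous in row-major order, so it can be copied in one call as well; flattening matters most for arrays built from separately allocated rows.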


Thanks for the replies. All the mallocs have been implemented already. I don’t want the 2D structure on the device. Think of it as an iterative process on the host, where the first index of array_h represents the iteration and the second dimension holds the values for array_d at that iteration. I am basically only updating the already existing array on the device; the host structure controls the updates.
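
For what it is worth, a rough sketch of that host-driven update pattern, under the assumption that each row of array_h is one iteration's input. In C and C++, array_h[i] already decays to an int* pointing at the start of row i, so it can be passed to cudaMemcpy as the source without any extra dereferencing. The accumulate kernel, the sum_d buffer and the function run_host_driven_updates are hypothetical stand-ins for whatever work is done per iteration.

```cuda
#include <cuda_runtime.h>

constexpr int ITERS = 5;  // first dimension of array_h: one row per iteration
constexpr int N = 5;      // second dimension: values copied into array_d

// Hypothetical per-iteration work: adds the N values currently held in
// array_d to a running total stored on the device.
__global__ void accumulate(const int *array_d, int *sum_d) {
    int s = 0;
    for (int j = 0; j < N; ++j) s += array_d[j];
    *sum_d += s;
}

// Overwrites the same 1D device array once per iteration and returns the
// total accumulated over all iterations, or -1 if allocation fails.
int run_host_driven_updates() {
    int array_h[ITERS][N];
    for (int i = 0; i < ITERS; ++i)
        for (int j = 0; j < N; ++j)
            array_h[i][j] = i * N + j;

    int *array_d = nullptr;
    int *sum_d = nullptr;
    if (cudaMalloc((void **)&array_d, N * sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc((void **)&sum_d, sizeof(int)) != cudaSuccess) {
        cudaFree(array_d);
        return -1;
    }
    cudaMemset(sum_d, 0, sizeof(int));

    for (int i = 0; i < ITERS; ++i) {
        // array_h[i] is a pointer to the first element of row i.
        cudaMemcpy(array_d, array_h[i], N * sizeof(int), cudaMemcpyHostToDevice);
        accumulate<<<1, 1>>>(array_d, sum_d);
    }

    int sum_h = 0;
    cudaMemcpy(&sum_h, sum_d, sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(array_d);
    cudaFree(sum_d);
    return sum_h;
}
```

Because the copies and the kernel launches all go through the default stream, each kernel sees the row that was copied just before it. With the values used here the function returns 300, the sum of 0 through 24.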

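You could use UVA (Unified Virtual Addressing), which is provided by the driver. It puts host and device pointers into a single address space, so with UVA a kernel can dereference a host pointer. As far as I know, on most systems this only works for pinned (page-locked) host memory, e.g. memory allocated with cudaHostAlloc, not for an ordinary array like array_h.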
Something to start from
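
A minimal sketch of letting a kernel work directly on host memory, assuming the host buffer is allocated as pinned, mapped memory with cudaHostAlloc. The increment kernel and the function mapped_host_sum are hypothetical names used only for this example. On older setups cudaSetDeviceFlags(cudaDeviceMapHost) may need to be called before any other CUDA call; that is an assumption about your configuration, not something checked here.

```cuda
#include <cuda_runtime.h>

constexpr int N = 5;

// Hypothetical kernel: adds 1 to each element through the mapped pointer.
__global__ void increment(int *data) {
    int j = threadIdx.x;
    if (j < N) data[j] += 1;
}

// Allocates pinned, mapped host memory, lets the kernel update it in place,
// and returns the sum of the updated values (-1 if allocation fails).
int mapped_host_sum() {
    int *h = nullptr;
    if (cudaHostAlloc((void **)&h, N * sizeof(int), cudaHostAllocMapped) != cudaSuccess) {
        return -1;
    }
    for (int j = 0; j < N; ++j) h[j] = j;

    // Device-side alias of the same memory; with UVA it usually equals h.
    int *d = nullptr;
    cudaHostGetDevicePointer((void **)&d, h, 0);

    increment<<<1, N>>>(d);
    cudaDeviceSynchronize();  // wait before reading the results on the host

    int sum = 0;
    for (int j = 0; j < N; ++j) sum += h[j];

    cudaFreeHost(h);
    return sum;
}
```

Every access from the kernel goes over the bus in this setup, so it tends to suit data that is read or written once rather than data that is reused heavily.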

Unless you have a specific reason not to, you should keep everything that is needed on the GPU. If some data is used very often and it lives in host memory, access will be slow. If the array is too large for GPU memory but is accessed often, you can copy it over in chunks and overlap the copies with computation on the GPU.
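
The chunked approach could be sketched roughly as follows, assuming two CUDA streams and a pinned host buffer so that cudaMemcpyAsync can overlap with kernels. The chunk size, the add_one kernel and the function process_in_chunks are hypothetical and chosen only for illustration.

```cuda
#include <cuda_runtime.h>

constexpr int CHUNK = 1 << 16;    // elements per chunk (hypothetical)
constexpr int NUM_CHUNKS = 8;     // host array holds CHUNK * NUM_CHUNKS ints
constexpr int NUM_STREAMS = 2;    // one device buffer per stream

// Hypothetical per-chunk work: adds 1 to every element.
__global__ void add_one(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

// Streams a large host array through two chunk-sized device buffers.
// Returns true if every element came back incremented by one.
bool process_in_chunks() {
    const size_t total = (size_t)CHUNK * NUM_CHUNKS;
    const size_t chunk_bytes = CHUNK * sizeof(int);

    // Pinned host memory is required for truly asynchronous copies.
    int *h = nullptr;
    if (cudaHostAlloc((void **)&h, total * sizeof(int), cudaHostAllocDefault) != cudaSuccess) {
        return false;
    }
    for (size_t i = 0; i < total; ++i) h[i] = (int)(i % 100);

    cudaStream_t streams[NUM_STREAMS];
    int *d[NUM_STREAMS];
    for (int s = 0; s < NUM_STREAMS; ++s) {
        cudaStreamCreate(&streams[s]);
        cudaMalloc((void **)&d[s], chunk_bytes);
    }

    for (int c = 0; c < NUM_CHUNKS; ++c) {
        int s = c % NUM_STREAMS;
        int *h_chunk = h + (size_t)c * CHUNK;
        // Work within one stream runs in order, so reusing d[s] is safe;
        // work in the other stream can overlap with it.
        cudaMemcpyAsync(d[s], h_chunk, chunk_bytes, cudaMemcpyHostToDevice, streams[s]);
        add_one<<<(CHUNK + 255) / 256, 256, 0, streams[s]>>>(d[s], CHUNK);
        cudaMemcpyAsync(h_chunk, d[s], chunk_bytes, cudaMemcpyDeviceToHost, streams[s]);
    }
    cudaDeviceSynchronize();

    bool ok = true;
    for (size_t i = 0; i < total; ++i) {
        if (h[i] != (int)(i % 100) + 1) ok = false;
    }

    for (int s = 0; s < NUM_STREAMS; ++s) {
        cudaFree(d[s]);
        cudaStreamDestroy(streams[s]);
    }
    cudaFreeHost(h);
    return ok;
}
```

How much overlap you actually get depends on the device and on how long the kernel runs relative to the copies, so it is worth checking with a profiler before relying on it.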