[SOLVED] Copy data from 1D to 3D with cudaMemcpy3D

Hi,

Is it possible to copy data from a 1D array to a 3D array with the cudaMemcpy3D function?

I have a 1D array in a kernel that stores the data of a 3D matrix. Now I need to copy this data from device to host, but on the host I need a 3D array.

Is that possible?

Thank you.

The principal purpose of cudaMemcpy2D and cudaMemcpy3D functions is to provide for the copying of data to or from pitched allocations.

Since you say “1D array in a kernel” I am assuming that is not a pitched allocation on the device.

I am also assuming the data representation you have or want on the host is not a pitched allocation (as that would be quite unusual).

In that case, you do not need anything other than cudaMemcpy. It can copy from a linear (non-pitched) allocation on the device to multiply-subscripted storage on the host, provided you allocate the host storage carefully: it must be built on top of a single contiguous underlying allocation.

A worked example of 1D device data and 2D (i.e. doubly-subscripted) host data is here:

http://stackoverflow.com/questions/24659276/cuda-allocating-1d-device-memory-to-copy-2d-pointer-to-pointer-host-array-to-an/24660223#24660223

The extension of this to 3D (i.e. triply-subscripted) host data should be straightforward, if perhaps tedious.
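As a rough illustration of that 3D extension, here is a minimal host-side sketch. The helper names alloc3D, free3D and copyFlatToGrid are hypothetical (they are not part of CUDA or of the linked answer), all three dimensions are assumed to be nonzero, and std::memcpy stands in for cudaMemcpy so the sketch can be compiled without a GPU. The point is only that all elements live in one contiguous block, with a[i][j][k] at offset (i * ny + j) * nz + k.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns a double*** whose elements all live in ONE
// contiguous block, so that a[i][j][k] is element (i * ny + j) * nz + k.
// Assumes nx, ny and nz are all nonzero.
double ***alloc3D(std::size_t nx, std::size_t ny, std::size_t nz)
{
    assert(nx > 0 && ny > 0 && nz > 0);

    double ***a    = (double ***)std::malloc(nx * sizeof(double **));
    double  **rows = (double **)std::malloc(nx * ny * sizeof(double *));
    double   *data = (double *)std::malloc(nx * ny * nz * sizeof(double));
    if (!a || !rows || !data) {
        std::free(data);
        std::free(rows);
        std::free(a);
        return nullptr;
    }

    for (std::size_t i = 0; i < nx; ++i) {
        a[i] = rows + i * ny;                    // row pointers for slab i
        for (std::size_t j = 0; j < ny; ++j)
            a[i][j] = data + (i * ny + j) * nz;  // start of line (i, j)
    }
    return a;
}

// Releases the three allocations made by alloc3D.
void free3D(double ***a)
{
    if (!a) return;
    std::free(a[0][0]);  // contiguous data block
    std::free(a[0]);     // row-pointer table
    std::free(a);        // slab-pointer table
}

// Host-only stand-in for the device-to-host copy: a[0][0] is the address of
// the first element of the contiguous block, so one flat copy fills it all.
void copyFlatToGrid(double ***grid, const double *flat, std::size_t count)
{
    std::memcpy(grid[0][0], flat, count * sizeof(double));
}
```

With host storage built this way, a single cudaMemcpy(a[0][0], d_ptr, nx * ny * nz * sizeof(double), cudaMemcpyDeviceToHost) should fill the whole triply-subscripted array, as long as the kernel writes the device buffer with the same index order.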

Hi,

Thank you txbob, that was what I needed. I will try it; I think it will work fine.

I have a code like this:

__global__ void myKernel(double *eel)
{
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    int j = blockDim.y * blockIdx.y + threadIdx.y;
    int k = blockDim.z * blockIdx.z + threadIdx.z;

    int idx = (((i * WIDTH) + j) * DEEP) + k; 

    //Do some operations and save result in 'data' variable
    ...
    double data = ...;
    eel[idx] = data;
}

int main()
{
    ...
    size_t size = HIGHT * WIDTH * DEEP * sizeof(double);

    //Allocate device memory
    double *d_eel;    
    cudaMalloc(&d_eel, size);
    cudaMemset(d_eel, 0, size);

    //Execute kernel
    myKernel<<<gridSize, blockSize>>>(d_eel);

    //Copy data from 'd_eel' to 'grid'. grid is a (double***) array of size HIGHT * WIDTH * DEEP
    cudaMemcpy(grid[0][0], d_eel, size, cudaMemcpyDeviceToHost);

    //Free device memory
    cudaFree(d_eel);

    ...
}

I have one question. In the cudaMemcpy call I pass ‘grid[0][0]’ as the host pointer. Is that correct? I think so, but I want to be sure.

Thank you!

Hi,

To answer my own question: yes, it is correct.

I just tried it and everything works fine.

Thank you again!