Interoperability, 3D texture

Hi all,

Could someone explain how to work with the cudaArray pointer for a 3D texture, e.g. to copy its data from device to host?

Below is the guide I followed to get the cudaArray pointer for a 2D texture defined in OpenGL, and it worked:

The main steps are:

// Register the OpenGL 2D texture with CUDA
struct cudaGraphicsResource* pCudaResourceTex2D;
cudaGraphicsGLRegisterImage(&pCudaResourceTex2D, myOpenglTex2D, GL_TEXTURE_2D, cudaGraphicsRegisterFlagsNone);
// Map the resource so CUDA can access it (default stream 0)
cudaGraphicsMapResources(1, &pCudaResourceTex2D, 0);
// Get the cudaArray behind array index 0, mip level 0
cudaArray* pMyCudaArray2D;
cudaGraphicsSubResourceGetMappedArray(&pMyCudaArray2D, pCudaResourceTex2D, 0, 0);

Then I can use

cudaMemcpyFromArray((void*)pHost, pMyCudaArray2D, 0, 0, nCounts, cudaMemcpyDeviceToHost);

to transfer the 2D texture data to the host. Here pHost is the destination address in host memory, and nCounts = width x height x 4 x sizeof(uint16_t), where 4 is the number of channels.
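
To make the byte count concrete, here is a minimal host-side sketch of how nCounts is computed. The helper name textureByteCount is made up for illustration (it is not a CUDA or OpenGL function), and it assumes the texels are tightly packed with no row padding. Pass depth = 1 for a 2D texture.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of CUDA or OpenGL): total number of bytes in
// a tightly packed texture. Assumes no padding between rows or slices.
// Use depth = 1 for a 2D texture.
std::size_t textureByteCount(std::size_t width, std::size_t height,
                             std::size_t depth, std::size_t channels,
                             std::size_t bytesPerChannel)
{
    // A zero dimension would describe an empty texture, which is not useful here.
    assert(width > 0 && height > 0 && depth > 0);
    return width * height * depth * channels * bytesPerChannel;
}
```

For example, textureByteCount(width, height, 1, 4, sizeof(std::uint16_t)) gives the nCounts value used in the 2D copy above.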

However, when I applied the same approach to a 3D texture, only multiplying nCounts by the depth, cudaMemcpyFromArray() failed with return code 1. What is the right way to handle this? Is there anything special about how OpenGL stores 3D texture data?

By the way, how does OpenGL store texture data? Does it convert the uint16_t input to float (see reference below)? If so, why did the copy succeed when I used sizeof(uint16_t) to compute the total byte count?

Thanks a lot for any help or suggestions!

Best regards