Transferring 3D datasets from Host to Device and Back. Please help!

I would really appreciate it if somebody could take the trouble to give a small example of transferring a 3D dataset from host to device and back.

For example, I have a 3D array in C of type int***. How do I create a 3D dataset on the GPU and transfer it from host to device?

This is really urgent.


The short answer is that there is no “easy” example. Your “3D” array is really a two-dimensional array of pointers, with each pointer holding the address of a single row of data. Allocating and copying data to the device in that form will require iterative cudaMalloc() and cudaMemcpy() calls, each allocating and copying a single row of data. At the end of it all, your device kernels will have to read through two levels of pointer indirection to get to your data (which is very slow), and none of the 3D API functions you have been asking about will work with data in that form anyway.

All arrays in the 2D and 3D CUDA memory access functions are really flat, 1D spaces which are padded for alignment and optimal access performance by the GPU memory controller. You would be much better served using a 1D array of size (x*y*z) and 1D addressing such as data[i + j*x + k*x*y] in column major order (or the equivalent row major order) on both the host and device.
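To make the flat-array suggestion concrete, here is a minimal sketch of the round trip, not a definitive implementation. The dimensions, the idx3() helper, the addOne kernel and the CHECK macro are all made up for illustration. It uses plain cudaMalloc()/cudaMemcpy() on an unpadded buffer rather than the pitched 3D API functions.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro: abort on any CUDA runtime error.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error: %s (line %d)\n",             \
                    cudaGetErrorString(err_), __LINE__);              \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Flat index of element (i, j, k) in an x * y * z volume, with i
// varying fastest: data[i + j*x + k*x*y].
__host__ __device__ inline size_t idx3(int i, int j, int k, int x, int y)
{
    return (size_t)i + (size_t)j * x + (size_t)k * x * y;
}

// Illustrative kernel (an assumption, not from the thread): adds 1 to
// every element, one thread per element.
__global__ void addOne(int *data, int x, int y, int z)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    int k = blockIdx.z * blockDim.z + threadIdx.z;
    if (i < x && j < y && k < z)
        data[idx3(i, j, k, x, y)] += 1;
}

int main()
{
    const int x = 64, y = 32, z = 16;           // example sizes (assumed)
    const size_t count = (size_t)x * y * z;
    const size_t bytes = count * sizeof(int);

    // One contiguous host allocation instead of int***.
    int *h_data = (int *)malloc(bytes);
    if (h_data == NULL) return EXIT_FAILURE;
    for (int k = 0; k < z; ++k)
        for (int j = 0; j < y; ++j)
            for (int i = 0; i < x; ++i)
                h_data[idx3(i, j, k, x, y)] = i + j + k;

    // One device allocation and one copy in each direction.
    int *d_data = NULL;
    CHECK(cudaMalloc((void **)&d_data, bytes));
    CHECK(cudaMemcpy(d_data, h_data, bytes, cudaMemcpyHostToDevice));

    dim3 block(8, 8, 4);                        // 256 threads per block
    dim3 grid((x + block.x - 1) / block.x,
              (y + block.y - 1) / block.y,
              (z + block.z - 1) / block.z);
    addOne<<<grid, block>>>(d_data, x, y, z);
    CHECK(cudaGetLastError());

    // cudaMemcpy on the default stream waits for the kernel to finish.
    CHECK(cudaMemcpy(h_data, d_data, bytes, cudaMemcpyDeviceToHost));

    // Verify the round trip on the host.
    int errors = 0;
    for (int k = 0; k < z; ++k)
        for (int j = 0; j < y; ++j)
            for (int i = 0; i < x; ++i)
                if (h_data[idx3(i, j, k, x, y)] != i + j + k + 1)
                    ++errors;
    printf("mismatches: %d\n", errors);

    CHECK(cudaFree(d_data));
    free(h_data);
    return errors == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

If the data already lives in an int***, one option is to copy it into such a flat buffer on the host first (h_data[idx3(i, j, k, x, y)] = a[k][j][i], or whichever ordering your array uses) and then do the single cudaMemcpy() as above.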