Has anyone got a WORKING example of 3D matrices in CUDA?

I have been searching through the forums for a long time now.

What I am trying to find is example code in C that does the following: allocates 3D data using cudaMalloc3D() (NOT using CUDA arrays), transfers N×M×L elements of linear data from the CPU to the GPU using cudaMemcpy3D(), runs a simple kernel that uses the 3D data (for example, sets each element to its thread’s ID), copies the data back to the linear CPU array, and then prints the results.

HAS ANYONE ACHIEVED THIS?

Thanks

The SDK example simpleTexture3D does use cudaMalloc3D(), cudaMemcpy3D(), etc. in file simpleTexture3D_kernel.cu, function initCuda().

There are quite a few idiosyncrasies worth studying there.
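
For the non-array path asked about above, here is a minimal sketch of my own, not code from the SDK sample. Everything in it is an assumption made for illustration: the names fillWithIndex and CHECK, the 4x3x2 float volume, the 100.0f start value, and the use of a 3D grid (which needs compute capability 2.0 or later). The kernel adds each thread's linear index to the uploaded value, so the printout reflects both the upload and the kernel. The main thing to watch is that the extent width is in bytes and that rows are padded to the pitch, so the kernel has to index through the pitch rather than assume packed data.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t e_ = (call);                                      \
        if (e_ != cudaSuccess) {                                      \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(e_));                          \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// One thread per element: add the element's linear index to its value.
__global__ void fillWithIndex(cudaPitchedPtr vol, int nx, int ny, int nz)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    int z = blockIdx.z * blockDim.z + threadIdx.z;
    if (x >= nx || y >= ny || z >= nz) return;

    // Rows are padded to vol.pitch bytes; a slice is pitch * ysize bytes.
    char  *base       = (char *)vol.ptr;
    size_t slicePitch = vol.pitch * vol.ysize;
    float *row        = (float *)(base + z * slicePitch + y * vol.pitch);
    row[x] += (float)((z * ny + y) * nx + x);
}

int main(void)
{
    const int nx = 4, ny = 3, nz = 2;            // N, M, L (assumed sizes)
    const int count = nx * ny * nz;

    // Packed linear host array, x fastest, then y, then z.
    float *h = (float *)malloc(count * sizeof(float));
    for (int i = 0; i < count; ++i) h[i] = 100.0f;

    // For linear (non-array) memory the extent width is in BYTES.
    cudaExtent extent = make_cudaExtent(nx * sizeof(float), ny, nz);
    cudaPitchedPtr d_vol;
    CHECK(cudaMalloc3D(&d_vol, extent));

    // Host -> device. The host side is described as a pitched pointer
    // whose pitch is simply the packed row size.
    cudaMemcpy3DParms up = {0};
    up.srcPtr = make_cudaPitchedPtr(h, nx * sizeof(float), nx, ny);
    up.dstPtr = d_vol;
    up.extent = extent;
    up.kind   = cudaMemcpyHostToDevice;
    CHECK(cudaMemcpy3D(&up));

    dim3 block(8, 8, 1);
    dim3 grid((nx + block.x - 1) / block.x,
              (ny + block.y - 1) / block.y,
              nz);
    fillWithIndex<<<grid, block>>>(d_vol, nx, ny, nz);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());

    // Device -> host, back into the same packed array.
    cudaMemcpy3DParms down = {0};
    down.srcPtr = d_vol;
    down.dstPtr = make_cudaPitchedPtr(h, nx * sizeof(float), nx, ny);
    down.extent = extent;
    down.kind   = cudaMemcpyDeviceToHost;
    CHECK(cudaMemcpy3D(&down));

    for (int z = 0; z < nz; ++z)
        for (int y = 0; y < ny; ++y)
            for (int x = 0; x < nx; ++x)
                printf("h[%d][%d][%d] = %.0f\n", z, y, x,
                       h[(z * ny + y) * nx + x]);

    CHECK(cudaFree(d_vol.ptr));
    free(h);
    return 0;
}
```

If the upload and the kernel both worked, each printed value should be 100 plus the element's linear index. On hardware without 3D grids you would have to fold z into the y grid dimension or loop over z inside the kernel.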

Hope this helps.
