How to define a three-dimensional array on the GPU?

I have a question:
I have defined a three-dimensional array on the CPU (using the malloc function). I want to know how to define a three-dimensional array on the GPU (device), and how to copy the data from the CPU (host) to the GPU (device).

My code to define a three-dimensional array (35*35*35) on the CPU is as follows:

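int d1 = 35, d2 = 35, d3 = 35; // 35*35*35 = 42875 elements
int ***delta = (int ***)malloc(d1 * sizeof(int **)); // d1 pointers to planes
for (int ii = 0; ii < d1; ii++) {
    delta[ii] = (int **)malloc(d2 * sizeof(int *)); // d2 pointers to rows
    for (int j = 0; j < d2; j++) {
        // Assumed: the post is cut off here; allocate each row and zero it.
        delta[ii][j] = (int *)malloc(d3 * sizeof(int));
        for (int k = 0; k < d3; k++)
            delta[ii][j][k] = 0;
    }
}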

Can anybody help me?

It’s better to just allocate an array of 35*35*35 (42875) floats and calculate the indices manually in your kernel with z*35*35+y*35+x. For CPU usage it might help to have pointers to each row (to save multiplication operations) like you do, but for the GPU this isn’t really the case, because arithmetic is fast and memory access is slow.

But how do I allocate an array of 35*35*35 (42875) floats and calculate the indices manually in my kernel with z*35*35+y*35+x? Can you give me an example?
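A minimal sketch of the flat-array approach described above, not a definitive implementation. The names flatIndex, doubleCube and runDoubleCube are made up for illustration, and the 3D launch assumes a GPU and toolkit that accept three-dimensional grids (compute capability 2.0 or later). The host array is allocated with a single malloc, copied to the device with cudaMemcpy, and indexed in the kernel with z*35*35+y*35+x.

```cuda
#include <cstdlib>
#include <cuda_runtime.h>

#define DIM 35  // edge length from the question: 35*35*35 = 42875 elements

// Map (x, y, z) to a position in the flat array; x varies fastest.
__host__ __device__ inline int flatIndex(int x, int y, int z)
{
    return z * DIM * DIM + y * DIM + x;
}

// Hypothetical kernel: each thread doubles one element of the cube.
__global__ void doubleCube(float *cube)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    int z = blockIdx.z * blockDim.z + threadIdx.z;
    if (x < DIM && y < DIM && z < DIM)   // the grid overshoots 35, so guard
        cube[flatIndex(x, y, z)] *= 2.0f;
}

// Host -> device -> host round trip; returns 0 if the result is as expected.
int runDoubleCube(void)
{
    const size_t count = (size_t)DIM * DIM * DIM;
    const size_t bytes = count * sizeof(float);

    float *h_cube = (float *)malloc(bytes);          // one malloc on the host
    if (!h_cube) return 1;
    for (size_t i = 0; i < count; i++) h_cube[i] = (float)i;

    float *d_cube = NULL;                            // one cudaMalloc on the device
    if (cudaMalloc((void **)&d_cube, bytes) != cudaSuccess) {
        free(h_cube);
        return 1;
    }
    cudaMemcpy(d_cube, h_cube, bytes, cudaMemcpyHostToDevice);

    dim3 block(8, 8, 8);                             // 512 threads per block
    dim3 grid((DIM + 7) / 8, (DIM + 7) / 8, (DIM + 7) / 8);
    doubleCube<<<grid, block>>>(d_cube);

    // The copy back waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(h_cube, d_cube, bytes, cudaMemcpyDeviceToHost);

    int idx = flatIndex(1, 2, 3);
    int ok = (err == cudaSuccess) && (h_cube[idx] == 2.0f * (float)idx);

    cudaFree(d_cube);
    free(h_cube);
    return ok ? 0 : 1;
}
```

Calling runDoubleCube() from main is enough to try it. Because the same flatIndex function is marked __host__ __device__, the host and the kernel are guaranteed to agree on the layout.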



You can allocate all the memory you need at once, calling malloc only one time.

Be careful with sizeof and the correct type: even if your array has several dimensions, sizeof must always refer to the final element type, which is not a pointer to a pointer to int (sizeof(int**)) but int itself (sizeof(int)).

The other questions I cannot answer yet, because I am just starting with CUDA.


Code: int *delta = (int *)malloc(d1 * d2 * d3 * sizeof(int));

You mean change the 3D array to a 1D array?

Sorry, that's the point: my answer was correct for standard C programming, but CUDA is different.


My statement was not correct, sorry.

I found a link about initializing three-dimensional C/C++ arrays:


But this was not your question.

You asked about three-dimensional arrays on the GPU.

Hi dadada,

In the CUDA programming guide, CUDA arrays are described as being only one- or two-dimensional.

If you intend to compute a cube,
you have to process several two-dimensional arrays that together form the cube.

Tron Memory
Device memory can be allocated either as linear memory or as CUDA arrays.
Linear memory exists on the device in a 32-bit address space, so separately allocated entities can reference one another via pointers, for example, in a binary tree.
CUDA arrays are opaque memory layouts optimized for texture fetching (see Section 4.3.4). They are one-dimensional or two-dimensional and composed of elements, each of which has 1, 2 or 4 components that may be signed or unsigned

What do you need to do in the 3D array? As long as you are aware of certain limitations, you can simulate a 3D array with a big 1D array using a double stride.

I haven’t done this myself, only in 2D, but I think it’d look like this:

float bigArr[xSize * ySize * zSize];

int index3d = xPos + yPos * xSize + zPos * xSize * ySize;

float value3d = bigArr[index3d];

If you imagine the 3D array as a cube, you first move horizontally by xPos, then you move yPos down in strides of xSize - the width of the cube - then you move “up” the cube, the Z-axis, in strides of xSize*ySize - the X-Y cross-sectional area of the cube.
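The stride picture above can be sketched as a pair of helper functions, one going from (xPos, yPos, zPos) to the linear index and one going back. This is only an illustration: the Dims struct and the names toLinear, fromLinear and fillWithZ are hypothetical, not taken from any library. The kernel uses a plain 1D launch and recovers the coordinates from the thread's linear index.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper type holding the three extents of the cube.
struct Dims { int xSize, ySize, zSize; };

// Forward mapping: stride 1 along x, xSize along y, xSize*ySize along z.
__host__ __device__ inline int toLinear(Dims d, int xPos, int yPos, int zPos)
{
    return xPos + yPos * d.xSize + zPos * d.xSize * d.ySize;
}

// Inverse mapping: peel the strides off again.
__host__ __device__ inline void fromLinear(Dims d, int index,
                                           int *xPos, int *yPos, int *zPos)
{
    *xPos = index % d.xSize;
    *yPos = (index / d.xSize) % d.ySize;
    *zPos = index / (d.xSize * d.ySize);
}

// 1D launch: thread i owns element i and stores its own z coordinate there.
__global__ void fillWithZ(float *bigArr, Dims d)
{
    int index = blockIdx.x * blockDim.x + threadIdx.x;
    if (index >= d.xSize * d.ySize * d.zSize) return;  // past the end of the cube

    int x, y, z;
    fromLinear(d, index, &x, &y, &z);
    bigArr[index] = (float)z;
}
```

For a 35*35*35 cube, a launch such as fillWithZ<<<(42875 + 255) / 256, 256>>>(d_bigArr, dims) covers every element, where d_bigArr is assumed to be a device pointer obtained from cudaMalloc.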

It’s convoluted, but trust me, you’ll get used to it :)

I sincerely apologize for muddying the issue if the above code is wrong.

edit: DUH, I didn’t read the previous posts and thus doomed myself to foolish repetition. Didn’t mean disrespect to previous posters.

How about “__device__ float ThreeD[x][y][z]”?

we SOOOO wish lol.

I have to go with kristleifur on this one. I’m also using this notation. At first it was difficult, but now that I’m used to it, it is very easy and quick.


How about cudaMalloc3D from CUDA 2.0?

I can’t understand how to access the memory when using this function. Can anyone help?
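A hedged sketch of one way to access memory returned by cudaMalloc3D. The function returns a cudaPitchedPtr whose rows are pitch bytes apart, so an element is found by stepping z slices of pitch*height bytes, then y rows of pitch bytes, then x floats. The names elementAt, addOne and runPitchedCube are hypothetical, and the 3D launch again assumes a GPU that supports three-dimensional grids.

```cuda
#include <cstdlib>
#include <cuda_runtime.h>

// Address of element (x, y, z) in pitched memory with `height` rows per slice.
__host__ __device__ inline float *elementAt(cudaPitchedPtr p, int height,
                                            int x, int y, int z)
{
    char *slice = (char *)p.ptr + (size_t)z * p.pitch * height;
    float *row = (float *)(slice + (size_t)y * p.pitch);
    return row + x;
}

// Hypothetical kernel: add 1 to every element of the volume.
__global__ void addOne(cudaPitchedPtr p, int width, int height, int depth)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    int z = blockIdx.z * blockDim.z + threadIdx.z;
    if (x < width && y < height && z < depth)
        *elementAt(p, height, x, y, z) += 1.0f;
}

// Round trip through cudaMalloc3D / cudaMemcpy3D; returns 0 on success.
int runPitchedCube(void)
{
    const int w = 35, h = 35, d = 35;
    const size_t count = (size_t)w * h * d;
    float *host = (float *)malloc(count * sizeof(float));
    if (!host) return 1;
    for (size_t i = 0; i < count; i++) host[i] = (float)i;

    // For linear memory the extent width is given in bytes.
    cudaExtent extent = make_cudaExtent(w * sizeof(float), h, d);
    cudaPitchedPtr dev;
    if (cudaMalloc3D(&dev, extent) != cudaSuccess) { free(host); return 1; }

    // The host array is tightly packed, so its pitch is w * sizeof(float).
    cudaMemcpy3DParms in = {0};
    in.srcPtr = make_cudaPitchedPtr(host, w * sizeof(float), w, h);
    in.dstPtr = dev;
    in.extent = extent;
    in.kind = cudaMemcpyHostToDevice;
    cudaError_t err = cudaMemcpy3D(&in);

    dim3 block(8, 8, 8);
    dim3 grid((w + 7) / 8, (h + 7) / 8, (d + 7) / 8);
    addOne<<<grid, block>>>(dev, w, h, d);

    cudaMemcpy3DParms out = {0};
    out.srcPtr = dev;
    out.dstPtr = make_cudaPitchedPtr(host, w * sizeof(float), w, h);
    out.extent = extent;
    out.kind = cudaMemcpyDeviceToHost;
    if (err == cudaSuccess) err = cudaMemcpy3D(&out);

    int ok = (err == cudaSuccess) && host[0] == 1.0f
             && host[count - 1] == (float)count;
    cudaFree(dev.ptr);
    free(host);
    return ok ? 0 : 1;
}
```

The pitch chosen by the runtime may be larger than 35 * sizeof(float), which is why the kernel never indexes the pointer as if it were a tightly packed array.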

But there’s a bug in cudaMemcpy3D as pointed out by grabner in the following thread.