I have a question:
I have defined a three-dimensional array on the CPU (using the malloc function). I want to know how to define a three-dimensional array on the GPU (device), and how to copy the data from the CPU (host) to the GPU (device).
My code to define a three-dimensional array (35*35*35) on the CPU is as follows:
It’s better to just allocate an array of 35*35*35 (42875) floats and calculate the indices manually in your kernel as z*35*35 + y*35 + x. On the CPU it might help to have pointers to each row (to save multiplication operations) like you do, but on the GPU this isn’t really the case because arithmetic is fast and memory access is slow.
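A minimal sketch of that manual indexing in plain C, so it can be checked on the host first. The names N, flat_index and fill_volume are made up for illustration and are not from the original post; the same index expression would be used unchanged inside a kernel.

```c
#include <assert.h>
#include <stddef.h>

#define N 35  /* edge length of the cube, as in the question */

/* Hypothetical helper: offset of element (x, y, z) in a flat
   N*N*N buffer where x varies fastest, then y, then z. */
size_t flat_index(size_t x, size_t y, size_t z)
{
    assert(x < N && y < N && z < N);
    return z * N * N + y * N + x;
}

/* Hypothetical helper: fill a flat buffer of N*N*N floats,
   storing x + y + z in each element as recognizable test data. */
void fill_volume(float *a)
{
    for (size_t z = 0; z < N; ++z)
        for (size_t y = 0; y < N; ++y)
            for (size_t x = 0; x < N; ++x)
                a[flat_index(x, y, z)] = (float)(x + y + z);
}
```

Keeping x in the innermost loop means consecutive iterations touch consecutive memory locations.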
You can allocate all the memory you need at once by calling malloc only one time.
Be careful with sizeof and the correct type: even if your array has several dimensions, sizeof must always target the final element type, which is not a pointer to a pointer to int (sizeof(int**)) but int itself (sizeof(int)).
I can't answer the other questions yet because I am just starting with CUDA.
Tron777
Code: int *delta = (int *)malloc(d1 * d2 * d3 * sizeof(int));
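A small sketch of that single allocation wrapped in a function. The name alloc_int_volume is hypothetical, and writing sizeof *delta instead of sizeof(int) is just one way to make sure the element type, not a pointer type, is measured.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocate a d1*d2*d3 block of ints with one
   malloc call. Returns NULL on failure; the caller must free() it. */
int *alloc_int_volume(size_t d1, size_t d2, size_t d3)
{
    assert(d1 > 0 && d2 > 0 && d3 > 0);

    /* sizeof *delta is sizeof(int), the element type,
       never sizeof(int **) or any other pointer size. */
    int *delta = malloc(d1 * d2 * d3 * sizeof *delta);
    return delta;
}
```

Because the block is contiguous, the same byte count (d1 * d2 * d3 * sizeof(int)) is what a later cudaMalloc and a single cudaMemcpy to the device would need; that part is not shown here.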
In the CUDA Programming Guide, arrays are mentioned only as one- or two-dimensional.
If you intend to compute a cube,
you have to process several two-dimensional arrays that together form the cube.
Tron
4.5.1.2 Memory
Device memory can be allocated either as linear memory or as CUDA arrays.
Linear memory exists on the device in a 32-bit address space, so separately allocated entities can reference one another via pointers, for example, in a binary tree.
CUDA arrays are opaque memory layouts optimized for texture fetching (see Section 4.3.4). They are one-dimensional or two-dimensional and composed of elements, each of which has 1, 2 or 4 components that may be signed or unsigned
If you imagine the 3D array as a cube, you first move horizontally by xPos, then you move yPos down in strides of xSize - the width of the cube - then you move “up” the cube, the Z-axis, in strides of xSize*ySize - the X-Y cross-sectional area of the cube.
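A rough sketch of that walk through the cube, written step by step. xPos, yPos, xSize and ySize follow the description; zPos and the function names cube_offset and cube_position are assumptions added for illustration. The second function goes the other way, from a flat offset back to a position.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: flat offset reached by moving xPos along the row,
   yPos rows of width xSize, and zPos slices of area xSize*ySize. */
size_t cube_offset(size_t xPos, size_t yPos, size_t zPos,
                   size_t xSize, size_t ySize)
{
    assert(xPos < xSize && yPos < ySize);

    size_t offset = 0;
    offset += xPos;                  /* move horizontally            */
    offset += yPos * xSize;          /* move down, one row per step  */
    offset += zPos * xSize * ySize;  /* move up, one slice per step  */
    return offset;
}

/* Hypothetical inverse: recover (xPos, yPos, zPos) from a flat offset. */
void cube_position(size_t offset, size_t xSize, size_t ySize,
                   size_t *xPos, size_t *yPos, size_t *zPos)
{
    *xPos = offset % xSize;
    *yPos = (offset / xSize) % ySize;
    *zPos = offset / (xSize * ySize);
}
```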
It’s convoluted, but trust me, you’ll get used to it :)
I sincerely apologize for muddying the issue if the above code is wrong.
edit: DUH, I didn’t read the previous posts and thus doomed myself to foolish repetition. Didn’t mean disrespect to previous posters.
I have to go with kristleifur about this one… I’m also using this notation. At first it was difficult, but now that I’m used to it, it is very easy and quick.