Please give me some ideas on how to deal with 2D or 3D arrays in CUDA.
During the last several weeks I have been busy getting to know C and CUDA, and I'd say I have only tasted a little of it so far.
(Still a long way to go.) I have learned a lot reading posts here…
Now I am stepping in a little deeper.
I have code that deals with 2D and 3D matrices in C, like a_h[i][j] or b_h[i][j][k].
I don't have time to change all the 2D and 3D matrices into 1D, so I have to deal with them as they are.
Now, I researched a little and I see some CUDA functions like cudaMemcpy2D, cudaMemcpy3D, cudaMalloc3D, etc.
Before getting into more detail on those, I have a general question:
Can I use a 2D or 3D array directly in a CUDA kernel?
Or do I still need to work with 1D arrays in the kernel even if my host code uses 2D and 3D arrays?
That is, can I do something like the following?
__global__ void MatAdd()
{
    c[i][j] = a[i][j] + b[i][j];
}
Or must it be something like this?
__global__ void MatAdd()
{
    c[i] = a[i] + b[i];   /* a and b are 2D or 3D in the host code */
}
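To make the second option concrete, here is a rough sketch of what I imagine the flattened version would look like (my own guess, not working code from my project; MatAddFlat and N are placeholder names, and I assume square N x N float matrices copied to the device as one contiguous block):

__global__ void MatAddFlat(const float *a, const float *b, float *c, int N)
{
    int i = blockIdx.y * blockDim.y + threadIdx.y;   /* row    */
    int j = blockIdx.x * blockDim.x + threadIdx.x;   /* column */
    if (i < N && j < N)
        c[i * N + j] = a[i * N + j] + b[i * N + j];  /* 2D index mapped to 1D */
}

(If the host arrays are declared statically, e.g. float a_h[N][N], they are already contiguous, so I suppose a plain cudaMemcpy of N * N * sizeof(float) bytes would move them over.)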
I'd like to set a clear direction first and start from there…
Please advise me on this. Many thanks in advance.
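For reference, here is my current understanding of the pitched calls I mentioned above (cudaMallocPitch / cudaMemcpy2D). This is only a sketch based on the documentation, with N as a placeholder compile-time constant and a_h a statically allocated host matrix:

float a_h[N][N];   /* contiguous host array */
float *a_d;
size_t pitch;      /* bytes per device row, possibly padded for alignment */

cudaMallocPitch((void **)&a_d, &pitch, N * sizeof(float), N);
cudaMemcpy2D(a_d, pitch,                 /* destination pointer and its row pitch */
             a_h, N * sizeof(float),     /* source pointer and its row pitch      */
             N * sizeof(float), N,       /* width in bytes, height in rows        */
             cudaMemcpyHostToDevice);

/* inside a kernel, row i of a pitched allocation would be reached via the pitch:
   float *row = (float *)((char *)a_d + i * pitch); then row[j] is element (i, j) */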
Unless you're experienced with CUDA, your matrix multiplication implementation will likely be very inefficient (by a factor of a hundred or so). I find studying the matrix multiplication example in the NVIDIA SDK far more educational than trying to implement it by oneself.
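Roughly, what the SDK sample does is tile the multiplication through shared memory, so each block reuses the data it has already loaded from global memory. A simplified sketch of that idea (not the actual SDK code; it assumes square n x n row-major matrices with n a multiple of TILE and a matching (n/TILE, n/TILE) grid of (TILE, TILE) blocks):

#define TILE 16

__global__ void MatMulTiled(const float *A, const float *B, float *C, int n)
{
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];

    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.0f;

    for (int t = 0; t < n / TILE; ++t) {
        /* each thread stages one element of A and one of B into shared memory */
        As[threadIdx.y][threadIdx.x] = A[row * n + t * TILE + threadIdx.x];
        Bs[threadIdx.y][threadIdx.x] = B[(t * TILE + threadIdx.y) * n + col];
        __syncthreads();

        for (int k = 0; k < TILE; ++k)
            acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        __syncthreads();
    }
    C[row * n + col] = acc;
}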
Yeah… I know. I am studying “Programming Massively Parallel Processors” and “CUDA by Example: An Introduction to General-Purpose GPU Programming”, and I see your point. But, well, I have to do what I have to do, and that's why I am asking around… ha ha ha.