Hi,
I am a newbie with CUDA (I’ve installed and done my first simple program today) and I would like to transform one my code to CUDA.
However, after readings tons of informations, I am a bit afraid with what is really possible with it …
Here is my problem
I have a list of data (Donnees, up to millions of points) with three coordinates x,y,and z (Donnees[i].x, Donnees[i].y, Donnees[i].z)
I want to calculate on a cubic matrix (i,j,k) the sum of cos(f_kxDonnees[i].x+
f_kyDonnees[i].y+f_kz*Donnees[i].z) where f_kx depends on i, f_ky depends on j, f_kz depends on k.
The maximum size of the matrix is 256256256.
Since the data is big, what can I do to optimise the C code for CUDA and what are the limits possible in term of threads, grid, and blocks and dimensions of all these strange concepts for me.
Here my C code.
for (i=0;i<=i_decx;i++)
{
f_kx= 2PI(f_kxmin + i*f_stepkx);
for (j=0;j<=i_decy;j++)
{
f_ky= 2*PI*(f_kymin + j*f_stepky);
for (k=0;k<=i_decz;k++)
{
f_kz= 2*PI*(f_kzmin + k*f_stepkz);
for (i2=0;i2<iNbAtom1;i2++)
{
f_k=f_kx*Donnees[i2].fPx+f_ky*Donnees[i2].fPy+f_kz*Donnees[i2].fPz;
fre[i][j][k]=fre[i][j][k]+cos (2*PI*f_k);
}
}
}
}
I don’t need the CUDA code but just want to evaluate what I should do or not do …
If anyone can answer me, thanks in advance