Hello, I’m new to the forum and to CUDA. I’m trying to solve a 3D diffusion problem for one of my projects. My kernel is below. I tested it on a very simple 3 x 3 x 3 cube in which the [1][1][1] element is 1.0 and everything else is 0.0. After one iteration, the output looks like this:
0
0
4.21299e+06
0
0.1
5.55869e+06
0
0
6.25582e+06
0
0.1
6.87205e+06
0.1
0.5
7.46127e+06
0
0.1
8.06317e+06
0
0
1.30519e+07
0
0
1.49227e+07
0
0
2.03995e+07
while the correct output should look like:
0
0
0
0
0.1
0
0
0
0
0
0.1
0
0.1
0.4
0.1
0
0.1
0
0
0
0
0
0.1
0
0
0
0
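The expected values above can be reproduced on the CPU with a plain triple loop over the same clamped 7-point stencil, which gives something to compare the GPU result against. This is only a sketch: the function name `diffusion_reference` is hypothetical, and the coefficient values used in the note below (cc = 0.4, the six neighbor coefficients = 0.1) are inferred from the expected numbers, not given explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for one diffusion step (not part of the original code).
// Layout assumption: element (x, y, k) lives at index x + y * nx + k * nx * ny.
std::vector<float> diffusion_reference(const std::vector<float> &in, int nx, int ny, int nz,
                                       float ce, float cw, float cn, float cs,
                                       float ct, float cb, float cc)
{
    assert(in.size() == static_cast<std::size_t>(nx) * ny * nz);
    std::vector<float> out(in.size());
    const int xy = nx * ny;  // elements per layer
    for (int k = 0; k < nz; ++k) {
        for (int y = 0; y < ny; ++y) {
            for (int x = 0; x < nx; ++x) {
                int c = x + y * nx + k * xy;
                // At a boundary the missing neighbor is replaced by the cell itself.
                int w = (x == 0)      ? c : c - 1;
                int e = (x == nx - 1) ? c : c + 1;
                int s = (y == 0)      ? c : c - nx;
                int n = (y == ny - 1) ? c : c + nx;
                int b = (k == 0)      ? c : c - xy;
                int t = (k == nz - 1) ? c : c + xy;
                out[c] = cc * in[c] + cw * in[w] + ce * in[e] + cs * in[s]
                       + cn * in[n] + cb * in[b] + ct * in[t];
            }
        }
    }
    return out;
}
```

With a 27-element input that is 1.0 at index 13 and 0.0 elsewhere, and the assumed coefficients, this yields 0.4 at index 13, 0.1 at indices 4, 10, 12, 14, 16 and 22, and 0 everywhere else, which matches the expected list.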
Also, here is how I launched the kernel:
diffusion<<<2, BLOCK_SIZE>>>(d_matrixIn, d_matrixOut, numRows, numCols, layers, ce, cw, cn, cs, ct, cb, cc);
Please help! I’d really appreciate it.
__global__ void diffusion(float *f1, float *f2, int nx, int ny, int nz,
                          float ce, float cw, float cn, float cs, float ct, float cb, float cc)
{
    // Global (x, y) coordinates of this thread; each thread walks one column along z.
    int x = blockDim.x * blockIdx.x + threadIdx.x;
    int y = blockDim.y * blockIdx.y + threadIdx.y;
    int c = x + y * nx;   // linear index of (x, y) in layer 0
    int xy = nx * ny;     // number of elements per layer
    // int j,jz,je,jn,jb,jw,js,jt;
    for (int k = 0; k < nz; ++k)
    {
        // Neighbor indices; at a boundary the neighbor is clamped to the cell itself.
        int w = (x == 0) ? c : c - 1;
        int e = (x == nx - 1) ? c : c + 1;
        int s = (y == 0) ? c : c - nx;
        int n = (y == ny - 1) ? c : c + nx;
        int b = (k == 0) ? c : c - xy;
        int t = (k == nz - 1) ? c : c + xy;
        f2[c] = cc * f1[c] + cw * f1[w] + ce * f1[e] + cs * f1[s]
              + cn * f1[n] + cb * f1[b] + ct * f1[t];
        c += xy;   // same (x, y) in the next layer
    }
}
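One thing that may be worth checking: the kernel reads blockIdx.y and threadIdx.y, but a launch of the form <<<2, BLOCK_SIZE>>> is one-dimensional, so y is 0 in every thread, and nothing stops threads with x >= nx from reading and writing. Whenever 2 * BLOCK_SIZE exceeds nx, several threads then touch the same or out-of-range elements. The sketch below assumes a 2D launch plus a bounds guard. The names `diffusion_guarded` and `diffusion_step_gpu` and the 16 x 16 block size are illustrative choices, not taken from the original code, and error checking of the CUDA calls is left out for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Same 7-point stencil as the kernel above, with a guard for out-of-range threads.
__global__ void diffusion_guarded(const float *f1, float *f2, int nx, int ny, int nz,
                                  float ce, float cw, float cn, float cs,
                                  float ct, float cb, float cc)
{
    int x = blockDim.x * blockIdx.x + threadIdx.x;
    int y = blockDim.y * blockIdx.y + threadIdx.y;
    if (x >= nx || y >= ny) return;  // threads outside the nx-by-ny plane do nothing

    int xy = nx * ny;
    int c = x + y * nx;
    for (int k = 0; k < nz; ++k) {
        int w = (x == 0)      ? c : c - 1;
        int e = (x == nx - 1) ? c : c + 1;
        int s = (y == 0)      ? c : c - nx;
        int n = (y == ny - 1) ? c : c + nx;
        int b = (k == 0)      ? c : c - xy;
        int t = (k == nz - 1) ? c : c + xy;
        f2[c] = cc * f1[c] + cw * f1[w] + ce * f1[e] + cs * f1[s]
              + cn * f1[n] + cb * f1[b] + ct * f1[t];
        c += xy;
    }
}

// Hypothetical host wrapper: runs one diffusion step on a host-side array.
std::vector<float> diffusion_step_gpu(const std::vector<float> &in, int nx, int ny, int nz,
                                      float ce, float cw, float cn, float cs,
                                      float ct, float cb, float cc)
{
    size_t bytes = in.size() * sizeof(float);
    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, in.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(16, 16);  // assumed tile size; any 2D shape that covers nx-by-ny works
    dim3 grid((nx + block.x - 1) / block.x, (ny + block.y - 1) / block.y);
    diffusion_guarded<<<grid, block>>>(d_in, d_out, nx, ny, nz, ce, cw, cn, cs, ct, cb, cc);

    std::vector<float> out(in.size());
    // Blocking copy on the default stream; it runs after the kernel has finished.
    cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return out;
}
```

The grid size is rounded up so that every (x, y) column gets a thread, and the guard makes the surplus threads exit. The same guard would also be needed if the original kernel were kept and only the launch configuration changed.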