How to handle multidimensional arrays indexed by loop variables inside a parallelised kernel function

Hello Everyone,
Hope you all are doing well.
I have written some __device__ code to parallelise my algorithm. Inside my kernel (device) code I have several nested for loops, and inside those loops there are some multidimensional arrays that are indexed by the loop variables.

The snippet can be seen below:

int natoms = 147;
int h = blockIdx.y*blockDim.y + threadIdx.y;
if (h < 147) {
    for (ll = 0; ll < lmax; ll++) {
        int ju = blockIdx.x*blockDim.x + threadIdx.x;
        for (b_mm = 0; b_mm < 19; b_mm++) {
            if (ju < natoms) {
                m_1 = abs(mm);

                if (mm < 0) {
                    temp1 = cuCmul(q, make_cuDoubleComplex(-m_1, 0));
                    ex1[(h*147)+ju] = cuCmul(temp1, make_cuDoubleComplex(fophi1[(h*147)+ju], 0));
                    comp_1_w[(h*147)+ju] = my_complex_exp(ex1[(h*147)+ju]);
                } else {
                    temp2 = cuCmul(q, make_cuDoubleComplex(m_1, 0));
                    ex2[(h*147)+ju] = cuCmul(temp2, make_cuDoubleComplex(fophi1[(h*147)+ju], 0));
                    ex3[(h*147)+ju] = my_complex_exp(ex2[(h*147)+ju]);
                    comp_1_w[(h*147)+ju] = cuCmul(ex3[(h*147)+ju], make_cuDoubleComplex(result, 0));
                    //printf("comp_1_w values : %f %f\n", comp_1_w[(h*147)+ju].x, comp_1_w[(h*147)+ju].y);
                }
                mult1[(h*147)+ju] = cuCmul(comp_1_w[(h*147)+ju], make_cuDoubleComplex(cons1[(ll*10)+b_mm], 0));
                Lmmnn = Lmn_all[ju][ll][m_1];
            }
        }
    }
}

As can be seen from the code above, cons1[(ll*10)+b_mm] depends on the loop variables ll and b_mm.
Similarly, in the statement below:
Lmmnn = Lmn_all[ju][ll][m_1];
Lmn_all is a multidimensional array whose indices depend on the thread index ju, the loop variable ll, and m_1.
I am not getting the correct values for Lmmnn in that last line. Could anyone please explain how to handle such multidimensional arrays inside for loops in parallelised code?
Thank You.
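
One common cause of wrong values in a lookup like Lmn_all[ju][ll][m_1] is that the array reaches the kernel as nested host pointers, or with extents that differ from the ones it was filled with. A frequently used alternative is to store the data in one contiguous buffer and compute the offset by hand. The following is only a minimal sketch of that idea, not the original code: the helper names flat3 and make_demo_table, the HD macro, and the extents used with them are assumptions made for illustration. It is plain C++ that should also compile unchanged under nvcc.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Expands to CUDA qualifiers under nvcc and to nothing in a plain C++ build.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Row-major offset of element [i][j][k] in an array with extents [n0][n1][n2].
// The first extent n0 is not needed to compute the offset.
// (flat3 is a hypothetical helper name, not part of any library.)
HD inline std::size_t flat3(std::size_t i, std::size_t j, std::size_t k,
                            std::size_t n1, std::size_t n2) {
    return (i * n1 + j) * n2 + k;
}

// Hypothetical host-side helper: builds a flat table in which element
// [i][j][k] holds 10000*i + 100*j + k, so that lookups are easy to verify.
inline std::vector<double> make_demo_table(std::size_t n0, std::size_t n1,
                                           std::size_t n2) {
    std::vector<double> table(n0 * n1 * n2);
    for (std::size_t i = 0; i < n0; ++i)
        for (std::size_t j = 0; j < n1; ++j)
            for (std::size_t k = 0; k < n2; ++k)
                table[flat3(i, j, k, n1, n2)] = 10000.0 * i + 100.0 * j + k;
    return table;
}
```

Assuming the flat buffer has been copied to the device (for example with cudaMalloc and cudaMemcpy) into a pointer called Lmn_all_flat, and that n_m is the size of the third dimension (both names are assumptions), the lookup inside the kernel would then read Lmmnn = Lmn_all_flat[flat3(ju, ll, m_1, lmax, n_m)];. The same extents must be used when filling the buffer on the host and when reading it in the kernel. It may also be worth checking that mm is actually updated inside the b_mm loop, since the snippet as posted never assigns it there.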