I am writing a simulation that runs on the GPU with CUDA.
A periodic boundary condition is imposed on the simulation box, so the following code wraps the coordinates back into the box:
x[i] += dx;
x[i] -= L*rintf(x[i]*invL);
where L and invL are the box length and the inverse box length. With this implementation, x[i] sometimes ends up exactly equal to half the box length, L2 = L/2.
As a result, when I copy the coordinate data to the host and calculate the cell index (the simulation box is divided into cells of length cellLength, with ncell cells in each direction):
ix = (int)( ( x[i] + L2 )/cellLength );
I sometimes get ix = ncell, and printing x[i] shows that it is exactly L2.
So I changed the line ix = (int)( ( x[i] + L2 )/cellLength ); to
ix = (int)( ( x[i] + L2 )/cellLength );
if ( ix == ncell ) ix = 0;
I use this guarded calculation (the assignment followed by the ix == ncell check) on both the CPU and the GPU to compute each particle's cell label. However, with the same coordinate data, the two give different results. Is there an error in the periodic wrapping or the cell index calculation above, or is this a precision issue, i.e. do the CPU and the GPU differ in floating-point behavior?
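For reference, the guarded calculation can be written as one small function, so that both code paths evaluate exactly the same expression. This is only a sketch: the name cell_index_guarded is hypothetical, and it is written as plain C here. In a CUDA build the same function could presumably be declared __host__ __device__ and shared by the CPU and GPU code, but that is an assumption, not what my code currently does.

```c
/* Hypothetical helper: cell index with the periodic guard from the post.
   Assumes x has already been wrapped into [-L2, L2]. */
static int cell_index_guarded(float x, float L2, float cellLength, int ncell)
{
    int ix = (int)((x + L2) / cellLength);

    /* x == L2 is the periodic image of x == -L2, so it belongs to cell 0. */
    if (ix >= ncell) {
        ix -= ncell;
    }
    return ix;
}
```

With L2 = 4, cellLength = 1 and ncell = 8, the coordinates 4.0f and -4.0f both land in cell 0, while 3.5f lands in cell 7.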