 # Data streaming in 2D array, LBM

Hi, I have a problem I can't explain: the results at the edges of the array are broken (they print as 0), and I'm not sure what's wrong. The variable 'a' is an array filled with values between 0 and 1 on the host. The program runs many time steps, so the kernel is executed many times. Can somebody explain what might be wrong in the kernel function?

My array layout (each cell i has two values, stored at index i * 2 and i * 2 + 1):

``````|--------|
| i |  n |   *2
|--------|
|  n |   *2 + 1
``````
``````__global__ void GPU_CalculateTemp(float *a, float *b) {

    int i = blockIdx.x * blockDim.x + threadIdx.x;

    // Patterns

    // Main loop
    if ((i > 0) && (i < N)) {
        // Collision
        float fl = a[i * 2];
        float fr = a[i * 2 + 1];

        float rho = fl + fr;
        float feq = 0.5 * rho;

        // Streaming to the left and right of i
        b[(i - 1) * 2] = omega * feq + (1 - omega) * fl;
        b[(i + 1) * 2 + 1] = omega * feq + (1 - omega) * fr;
    }
}
``````

Host code:

``````void compute(int rank, float **device_a, float **device_b) {
    int threadsperblock = N;  // static size: #define N 100
    int blockspergrid = 1;
    size_t size = N * sizeof(float);

    // After this copy, device_a is copied to a host array and printed
    cudaMemcpy(*device_a, *device_b, size, cudaMemcpyDeviceToDevice);
}
``````

The left edge of my array after the run; the second and fourth values are 0:

``````0.05	  0.00	  0.10	  0.00	  0.25	  0.11	  0.23	  0.14	  0.37
``````

size_t size seems undersized, given that you're accessing elements up to

b[(i + 1) * 2 + 1] for i ranging from 1 through N-1,

so the last element written is b[2*N + 1] (the last one read is a[2*N - 1]).
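
A quick way to check bounds like these is to replay the kernel's index arithmetic on the host. The sketch below is plain host-side C++ (it can also sit in the host part of a .cu file). The names AccessSummary, summarizeAccesses and requiredFloats are hypothetical helpers, not part of the posted code; only the guard and the index expressions are taken from the kernel.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper type (not from the post): summary of the accesses
// made by the posted kernel for a lattice of n cells.
struct AccessSummary {
    int maxRead = -1;            // highest index of a[] that is read
    int maxWrite = -1;           // highest index of b[] that is written
    std::vector<int> unwritten;  // indices in [0, 2*n) of b[] that no thread writes
};

// Replays the index arithmetic of GPU_CalculateTemp on the host,
// one loop iteration per thread index i.
AccessSummary summarizeAccesses(int n) {
    assert(n >= 0);
    AccessSummary s;
    // Two extra slots so the replay itself never goes out of bounds.
    std::vector<bool> written(2 * n + 2, false);

    for (int i = 0; i < n; ++i) {
        if ((i > 0) && (i < n)) {         // same guard as the kernel
            int left = (i - 1) * 2;       // index written by the left stream
            int right = (i + 1) * 2 + 1;  // index written by the right stream
            s.maxRead = std::max(s.maxRead, i * 2 + 1);
            s.maxWrite = std::max(s.maxWrite, std::max(left, right));
            written[left] = true;
            written[right] = true;
        }
    }
    for (int k = 0; k < 2 * n; ++k) {
        if (!written[k]) s.unwritten.push_back(k);
    }
    return s;
}

// Number of floats each buffer needs so that every replayed access is in bounds.
int requiredFloats(int n) {
    AccessSummary s = summarizeAccesses(n);
    return std::max(s.maxRead, s.maxWrite) + 1;
}
```

For N = 100, summarizeAccesses(100) gives maxRead = 199 and maxWrite = 201, so requiredFloats(100) is 202, and the indices of b that no thread writes are 1, 3 and 198. If b starts out zeroed (an assumption, the post does not say), indices 1 and 3 staying unwritten would be consistent with the 0.00 values in the listing above.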

Are you sure the arrays are allocated (and copied) with a size of at least 2*N*sizeof(float)?

size_t size = N*sizeof(float); seems to suggest otherwise.
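
One way to keep the posted index pattern in bounds is to size both buffers for the highest index it touches. The kernel as posted writes up to b[2*N + 1], so that means 2*N + 2 floats; the alternative would be to tighten the kernel guard so nothing is written past b[2*N - 1]. The following is only a minimal sketch under assumptions: the names streamStep, NUM_FLOATS, d_a and d_b, the placeholder data, and passing omega = 0.8 as a kernel parameter are illustrative and not taken from the post.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N 100                   // number of cells, as in the post
#define NUM_FLOATS (2 * N + 2)  // highest index touched is b[2*N + 1]

// Same index pattern as the posted kernel. Passing omega as a parameter is
// an assumption; the post does not show how omega is defined.
__global__ void streamStep(const float *a, float *b, float omega) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if ((i > 0) && (i < N)) {
        float fl = a[i * 2];
        float fr = a[i * 2 + 1];
        float feq = 0.5f * (fl + fr);
        b[(i - 1) * 2]     = omega * feq + (1.0f - omega) * fl;
        b[(i + 1) * 2 + 1] = omega * feq + (1.0f - omega) * fr;
    }
}

int main() {
    const size_t bytes = NUM_FLOATS * sizeof(float);  // not N * sizeof(float)

    float host[NUM_FLOATS];
    for (int k = 0; k < NUM_FLOATS; ++k) host[k] = 0.5f;  // placeholder data

    float *d_a = nullptr, *d_b = nullptr;
    if (cudaMalloc((void **)&d_a, bytes) != cudaSuccess ||
        cudaMalloc((void **)&d_b, bytes) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    // Both buffers start from the same data, so entries that no thread
    // writes keep defined values (an assumption, not from the post).
    cudaMemcpy(d_a, host, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, host, bytes, cudaMemcpyHostToDevice);

    streamStep<<<1, N>>>(d_a, d_b, 0.8f);  // omega = 0.8 is illustrative
    cudaError_t err = cudaGetLastError();  // launch errors
    if (err == cudaSuccess) err = cudaDeviceSynchronize();  // execution errors
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Copy the full buffer, using the same byte count as the allocation.
    cudaMemcpy(d_a, d_b, bytes, cudaMemcpyDeviceToDevice);
    cudaMemcpy(host, d_a, bytes, cudaMemcpyDeviceToHost);
    printf("host[0..3] = %.2f %.2f %.2f %.2f\n",
           host[0], host[1], host[2], host[3]);

    cudaFree(d_a);
    cudaFree(d_b);
    return 0;
}
```

The point of the sketch is that one byte count is computed once and reused for cudaMalloc and for every cudaMemcpy, so the allocation and the copies cannot drift apart the way N*sizeof(float) and a 2-values-per-cell layout do.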

Christian

My dear Christian, this was very helpful.