I'm having trouble parallelizing an operation on an array of numbers with CUDA.

For example, suppose we have an array M containing the numbers (1, 2, 3, 4, 5).

If I remove the number 2 from the array and shift everything after it to the left,

the resulting array would be (1, 3, 4, 5, 5),

where M[1] = M[2], M[2] = M[3], M[3] = M[4].

My question is: how can we do this in parallel in CUDA? When the shift runs in parallel there might be a race condition, because the thread handling M[1] (the position holding the number 2) is not guaranteed to act first. If the thread handling M[2] shifts first, M[1] then reads the already overwritten value and the resulting array becomes (1, 4, 4, 5, 5). Is there any method to handle this? I'm fairly new to CUDA, so I'm not sure what to do.
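To make the intended result concrete, here is a minimal sketch of one way the shift might be done without the race, assuming a second buffer is acceptable: each thread reads only from the input array and writes only to a separate output array, so no thread can see another thread's write. The names `shiftLeftFrom`, `removeAt`, and `removeIdx` are made up for this illustration, and error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Out-of-place left shift. `in` is never written and `out` is never read,
// so the order in which threads run does not affect the result.
__global__ void shiftLeftFrom(const int *in, int *out, int n, int removeIdx)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    // Elements at or after removeIdx take their right neighbour's value.
    // The last element has no right neighbour and keeps its own value.
    out[i] = (i >= removeIdx && i + 1 < n) ? in[i + 1] : in[i];
}

// Hypothetical host helper: copies the data to the GPU, runs the kernel,
// and returns the shifted array.
std::vector<int> removeAt(const std::vector<int> &h, int removeIdx)
{
    int n = static_cast<int>(h.size());
    std::vector<int> result(n);
    if (n == 0) return result;

    int *dIn = nullptr;
    int *dOut = nullptr;
    cudaMalloc(&dIn, n * sizeof(int));
    cudaMalloc(&dOut, n * sizeof(int));
    cudaMemcpy(dIn, h.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    shiftLeftFrom<<<blocks, threads>>>(dIn, dOut, n, removeIdx);

    // This blocking copy waits for the kernel to finish first.
    cudaMemcpy(result.data(), dOut, n * sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(dIn);
    cudaFree(dOut);
    return result;
}
```

With M = (1, 2, 3, 4, 5) and `removeIdx = 1`, this should give (1, 3, 4, 5, 5) regardless of thread scheduling. The cost is the extra buffer, which can be swapped with the input pointer afterwards instead of copied back.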

My current code is as follows:

```
__global__ void gpu_shiftSeam(int *MCEnergyMat, int *seam, int width, int height, int currRow)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;

    int index = i + width * j;
    if (i < width && j < height)
    {
        //shift values of -1 to the side of the image
        if (MCEnergyMat[i + width * j] == -1) //check if its equal to -1, if so remove and start shifting
        {
            if (i + 1 != width) //check boundary
                MCEnergyMat[index] = MCEnergyMat[index + 1];
        }
        if (seam[j] < i) //check if its past the "-1" shifting point, if so start shifting, else dont do anything
        {
            if (i + 1 != width) //check boundary
                MCEnergyMat[index] = MCEnergyMat[index + 1];
        }
    }
}
```

Here, seam[i] contains the index I would like to remove from the array, and MCEnergyMat is just a 1D array flattened from a 2D array. However, my code does not work, and I believe a race condition is the problem.
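For comparison, here is a rough sketch of an in-place variant that might sidestep the race by giving each row to a single thread, which then shifts that row sequentially from left to right. It assumes `seam[j]` holds the column to remove in row `j` with `0 <= seam[j] < width`, which is my reading of the kernel above. The names `shiftSeamPerRow` and `shiftSeamRows` are hypothetical, and error checking is left out.

```cuda
#include <cuda_runtime.h>
#include <vector>

// One thread per row. Within a row the shift runs left to right, so each
// element is read before it is overwritten, and rows never touch each other.
__global__ void shiftSeamPerRow(int *mat, const int *seam, int width, int height)
{
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (j >= height) return;

    int *row = mat + j * width;
    // Assumes 0 <= seam[j] < width. The last column keeps its old value.
    for (int i = seam[j]; i + 1 < width; ++i)
        row[i] = row[i + 1];
}

// Hypothetical host helper: runs the kernel on a flattened row-major matrix
// and returns the shifted copy.
std::vector<int> shiftSeamRows(std::vector<int> mat, const std::vector<int> &seam,
                               int width, int height)
{
    if (width <= 0 || height <= 0) return mat;

    int *dMat = nullptr;
    int *dSeam = nullptr;
    size_t matBytes = mat.size() * sizeof(int);
    size_t seamBytes = seam.size() * sizeof(int);
    cudaMalloc(&dMat, matBytes);
    cudaMalloc(&dSeam, seamBytes);
    cudaMemcpy(dMat, mat.data(), matBytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dSeam, seam.data(), seamBytes, cudaMemcpyHostToDevice);

    int threads = 128;
    int blocks = (height + threads - 1) / threads;
    shiftSeamPerRow<<<blocks, threads>>>(dMat, dSeam, width, height);

    // This blocking copy waits for the kernel to finish first.
    cudaMemcpy(mat.data(), dMat, matBytes, cudaMemcpyDeviceToHost);

    cudaFree(dMat);
    cudaFree(dSeam);
    return mat;
}
```

This needs no second buffer, but it only exposes one thread of parallelism per row, and neighbouring threads access memory a full row apart, so it is likely slower than an out-of-place version for wide images.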

Thanks!