Trying to write my own shader

Hi,

I have written a CPU application that uses a “shader” I developed for some specific image processing. It works fine on the CPU. Now that I have some experience with CUDA, I feel ready to move this “shader” code to the GPU. Then comes the usual question: how do I make sure that the 3D point associated with each pixel is the nearest one? I do it this way:

+I have a grid made of triangles. I compute the image coordinates of each vertex, then I can compute the 3D coordinates for each pixel in the projected triangle.

+I also compute the distance D to the principal point (optical center).

+I keep track of the smallest distance Dm found so far for each pixel. If the distance D is lower than Dm, the new 3D point may be visible and I have to update Dm with D.

+The problem with CUDA is that several threads may attempt to update Dm at the same time. So I use a simple strategy: update Dm, then read it again to make sure the stored distance is OK. Another thread may be doing the same thing at the same time, so I only check that Dm is lower than or equal to D.

+But things seem to be more complicated: Dm may look updated while another thread is overwriting it at the same time, so the final distance Dm may be greater than expected.

+My solution is to check several times that Dm is lower than or equal to D:

Dm=D1[ni];
if(Dm > D) {
  D1[ni]=D;
  it=0;
  while(1) {
    Dm=D1[ni];
    if(Dm <= D) {
      it++;
      if(it == 3) break;
    }
    else {
      it=0;
      D1[ni]=D;
    }
  }
}

Breaking at it == 3 is OK, and 2 also works. It works, but I am not happy with it.

Does anyone have a rigorous solution?

Yves
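
For reference, the retry loop in the post can in principle be replaced by a single atomic operation. The sketch below is only one possible direction, not a verified fix for the original application. It assumes that all distances are non-negative and never NaN, so that the bit pattern of a float, read as a signed int, sorts in the same order as the float itself and the built-in integer atomicMin can be applied to it. The names atomicMinNonNegative, depthTest and runDemo are hypothetical, and D1 is assumed to be initialised to a large non-negative value such as FLT_MAX.

```cuda
#include <cfloat>
#include <vector>
#include <cuda_runtime.h>

// Atomically stores min(*addr, value) in *addr and returns the previous value.
// Assumption: both floats are non-negative and not NaN, so comparing their
// bit patterns as signed ints gives the same result as comparing the floats.
__device__ float atomicMinNonNegative(float* addr, float value)
{
    int old = atomicMin(reinterpret_cast<int*>(addr), __float_as_int(value));
    return __int_as_float(old);
}

// Hypothetical kernel: thread i proposes distance D[i] for pixel ni[i].
__global__ void depthTest(float* D1, const float* D, const int* ni, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    // No re-read loop: the hardware serialises concurrent updates of D1[ni[i]].
    atomicMinNonNegative(&D1[ni[i]], D[i]);
}

// Hypothetical host driver (CUDA error checks omitted for brevity).
// 4096 candidates compete for 4 pixels; candidate i has distance n - i.
std::vector<float> runDemo()
{
    const int n = 4096, pixels = 4;
    std::vector<float> hD(n), hD1(pixels, FLT_MAX);
    std::vector<int> hNi(n);
    for (int i = 0; i < n; ++i) {
        hD[i] = static_cast<float>(n - i);
        hNi[i] = i % pixels;
    }

    float* dD1 = nullptr;
    float* dD = nullptr;
    int* dNi = nullptr;
    cudaMalloc(&dD1, pixels * sizeof(float));
    cudaMalloc(&dD, n * sizeof(float));
    cudaMalloc(&dNi, n * sizeof(int));
    cudaMemcpy(dD1, hD1.data(), pixels * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dD, hD.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dNi, hNi.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    depthTest<<<(n + 255) / 256, 256>>>(dD1, dD, dNi, n);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(hD1.data(), dD1, pixels * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dD1);
    cudaFree(dD);
    cudaFree(dNi);
    return hD1;
}
```

Calling runDemo() from a host main should leave each pixel holding the smallest distance proposed for it, whatever the thread scheduling. Note that this only protects Dm itself. If the 3D point has to be stored together with its distance, a separate write after the atomic can still race, so the two values would need to be updated as one unit, for example by packing the distance bits and a point index into a single 64-bit word; that variant is not shown here.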
