atomicMax for float

I have a heavily contended float that I want to make sure doesn’t get corrupted by concurrent writes.

Long story short, I have a 3D block executing where the z dimension corresponds to a specific row, the x dimension corresponds to the elements of that row, and the threads along the y dimension all share each element. I want to keep a running max variable (float) in each element, but I’m concerned about race conditions.

Is there any smart way to implement this? I’m on compute capability 1.1 (Quadro FX 1700).

  uint ri = __umul24(blockIdx.x, blockDim.x) + threadIdx.x;
  uint di = __umul24(blockIdx.y, blockDim.y) + threadIdx.y;
  uint z = __umul24(blockIdx.z, blockDim.z) + threadIdx.z + z_off;
  uint dim = __umul24(gridDim.y, blockDim.y);

  /*............*/

  RAY_2D *row = (RAY_2D *)((char*)d_output.ptr + z * d_output.pitch);

  //make sure these are set, but only once per ray
  if(row[ri].r < 0.0f)
  {
    row[ri].x = x;
    row[ri].y = y;
    row[ri].r = r;
    //row[ri].d = d;
  }

  //what to do here, d depends on di (y thread)
  row[ri].max_d = help_me_atomic_max(row[ri].max_d, d);

This page describes a cute trick for transforming a float into a sortable integer:

http://www.stereopsis.com/radix.html

It is a one-to-one function between unsigned ints and single precision floats, so you can:

  1. Transform float to this int representation.
  2. Use the unsigned int version of atomicMax.
  3. When you are done, you can read out the max value from global memory and convert it back to a float.

(NB: this might not behave properly with inf, nan, and possibly denormal floats.)
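
The three steps above could be sketched roughly like this. It is only a sketch, not the code from the linked page: the names float_to_key, key_to_float and atomic_max_key are made up for illustration, and it assumes the running max lives in an unsigned int slot initialised to 0 instead of in a float. It also assumes your toolchain accepts memcpy in device code; if it does not, the __float_as_int / __int_as_float intrinsics do the same bit cast on the device.

```cuda
#include <cstring>   // memcpy, used as a bit cast on both host and device

// Step 1: map a float to an unsigned key whose unsigned ordering
// matches the ordering of the original floats.
__host__ __device__ inline unsigned int float_to_key(float f)
{
    unsigned int u;
    memcpy(&u, &f, sizeof u);
    // Negative values: flip every bit. Non-negative values: flip only the sign bit.
    unsigned int mask = (u & 0x80000000u) ? 0xFFFFFFFFu : 0x80000000u;
    return u ^ mask;
}

// Step 3: inverse of float_to_key, used once all updates are finished.
__host__ __device__ inline float key_to_float(unsigned int k)
{
    unsigned int mask = (k & 0x80000000u) ? 0x80000000u : 0xFFFFFFFFu;
    unsigned int u = k ^ mask;
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

// Step 2 (hypothetical helper): max_key points at an unsigned int in global
// memory that was initialised to 0, the smallest possible key.
__device__ inline void atomic_max_key(unsigned int *max_key, float d)
{
    atomicMax(max_key, float_to_key(d));
}
```

In the kernel above, the last line would then become something like atomic_max_key(&row[ri].max_d_key, d), where max_d_key is a hypothetical unsigned int field replacing max_d. Once the kernel has finished, key_to_float recovers the float. A slot that was never updated still holds 0, which decodes to a NaN, so that case is easy to spot.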

thanks, man

Cool trick.

From my understanding of the floating point representation, I think this should also work for denormal numbers and inf. NaN behaves unusually with relational operators, so the integerized version might not behave quite the same way for NaN.
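
A quick host-side check of that reasoning could look like the sketch below. The names key_of and special_values_sort_as_expected are made up for illustration. It assumes the host compiler does not flush denormals to zero (so no fast-math style flags) and that the NAN macro expands to a positive quiet NaN; a NaN with the sign bit set would instead land below -inf.

```cuda
#include <cfloat>    // FLT_MAX, FLT_MIN
#include <cmath>     // INFINITY, NAN
#include <cstring>   // memcpy

// Float-to-key mapping from the radix trick, host side only in this sketch.
static unsigned int key_of(float f)
{
    unsigned int u;
    memcpy(&u, &f, sizeof u);
    return u ^ ((u & 0x80000000u) ? 0xFFFFFFFFu : 0x80000000u);
}

// Returns true if inf and denormals sort where ordinary comparison puts them,
// and a positive NaN sorts above +inf.
static bool special_values_sort_as_expected()
{
    const float denorm = FLT_MIN / 4.0f;   // a positive denormal (assumes no flush-to-zero)
    return key_of(-INFINITY) < key_of(-FLT_MAX)
        && key_of(-denorm)   < key_of(0.0f)
        && key_of(0.0f)      < key_of(denorm)
        && key_of(denorm)    < key_of(FLT_MIN)
        && key_of(FLT_MAX)   < key_of(INFINITY)
        && key_of(INFINITY)  < key_of(NAN);  // assumes NAN is a positive quiet NaN
}
```

The last comparison is where the integer version differs from float comparison: every relational test against NaN is false, but as a key a positive NaN is simply the largest value, so it would win an atomicMax.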