Fast float to double conversion

njuffa · August 13, 2015, 6:25pm

In various circumstances, one comes across a situation where some initial approximation is computed in single precision for performance reasons, then the result is up-converted to double precision for the final computation steps. Unfortunately, the float-to-double conversion instruction provided by the GPU is itself one of the slower 64-bit operations. One particular such context would be a double-precision cbrt() implementation.

I was therefore looking for something faster, in particular for “DP lite” platforms like sm_5x, leading to the code below. Maybe others find this useful as well. This translate to five instructions on sm_3x (3 LOPs, 2 shifts) and four instructions on sm_5x (2 shifts, MOV32I, LOP3). A three-instruction sequence should be possible by changing the mask to 0x00ffffff and using __byte_perm(), but this would really narrow the supported range down drastically, to [0.5,8.0). Use of the bitfield insertion instruction may be worth looking at as well, but my understanding is that BFE and BFI are fairly low-throughput instructions, so I tried sticking to simple operations.

/* works correctly if argument is in +/-[2**-15, 2**17), or zero, infinity, NaN */
__device__ __forceinline__ 
double my_fast_float2double (float a)
{
    unsigned int ia = __float_as_int (a);
    return __hiloint2double ((((ia >> 3) ^ ia) & 0x07ffffff) ^ ia, ia << 29);
}

CudaaduC · August 13, 2015, 6:52pm

In the course of programming CUDA particle simulations there is quite a bit of casting to and from 64 bit to 32 bit and back.

As you know for such simulations mixed precision is the new normal, with energy/dose accumulations done in 64 bit, and movement (ray tracing etc) done in 32 bit or less.

This will certainly get used in some optical photon simulations I have been working on recently.

Thanks for posting and the explanation!

njuffa · August 13, 2015, 6:57pm

I was actually motivated to code this up by a question in the [cuda] section of Stackoverflow the other day, where it seemed prudent to calculate initial approximations to the real roots of a cubic polynomial in single precision, then switch to double precision for the final Newton iterations on the approximate roots.

In case anybody was wondering about the version of the above based on __byte_perm(), here it is:

/* works correctly if argument is in +/-[2**-1, 2**3), or zero, infinity, NaN */
double my_fast_float2double (float a)
{
    unsigned int ia = __float_as_int (a);
    return __hiloint2double (__byte_perm (ia >> 3, ia, 0x7210), ia << 29);
}

Both of these implementations also work for negative operands, I have added a “+/-” to the comments to make that clearer. The basic idea here is to simply insert 3-bit strings of all 0s or all 1s as the exponent field is widened from 8 to 11 bits. I haven’t come up with anything faster yet, which doesn’t mean such solutions couldn’t exist.

Topic		Replies	Views
Double- to single-precision conversion on device using a compute device NOT supporting double type CUDA Programming and Performance	3	1388	May 4, 2009
Use float rather than double in a kernel? CUDA Programming and Performance	9	1800	November 14, 2019
More information about double precision in Guide? CUDA Programming and Performance	4	4068	May 30, 2008
double to float conversion CUDA Programming and Performance	1	4138	January 28, 2009
Speed of processing double, float, short CUDA Programming and Performance	1	1940	December 16, 2012
double and integer 64 bits CUDA Programming and Performance	5	3843	March 22, 2007
Problem of "Demoting to float" CUDA Programming and Performance	3	2912	November 15, 2011
Strange change in behaviour between float and double CUDA Programming and Performance	6	1334	April 1, 2009
A doubt... 64- 32- and 24-bit math... CUDA Programming and Performance	4	3361	September 23, 2009
Conversion of double to floats How to convert one double to two floats to use in single precision GP CUDA Programming and Performance	7	8039	December 15, 2008

Fast float to double conversion

Related topics