Precision is affected when using the __half (FP16) data type

Here is the code:

__global__ void warmup(){
    
    __half tmp = 3;
    __half b = 1;
    __half wacc = 0;
    for(int i = 0; i < 75; i++) {
        for(int j = 0; j < 16; j++) {
            printf("index: %d, before: %f ", i, __half2float(wacc));
            wacc = __hadd(wacc, __hmul(tmp, b));
            printf("after: %f\n ", __half2float(wacc));
        }
    }
}

I expect the result to be 3600; however, the following is the output:

 index: 74, before: 4072.000000 after: 4076.000000
 index: 74, before: 4076.000000 after: 4080.000000
 index: 74, before: 4080.000000 after: 4084.000000
 index: 74, before: 4084.000000 after: 4088.000000
 index: 74, before: 4088.000000 after: 4092.000000
 index: 74, before: 4092.000000 after: 4096.000000
 index: 74, before: 4096.000000 after: 4100.000000
 index: 74, before: 4100.000000 after: 4104.000000
 index: 74, before: 4104.000000 after: 4108.000000
 index: 74, before: 4108.000000 after: 4112.000000
 index: 74, before: 4112.000000 after: 4116.000000

So I guess the problem has to do with __hadd or __hmul, and the error accumulates in the for loop.

I also tried __hfma, using wacc = __hfma(tmp, b, wacc); in place of this line:

wacc = __hadd(wacc, __hmul(tmp, b));

but the output is the same as above.

So what can I do to get 3600 instead of the output above when using the __half data type? I really need someone's help!

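An FP16 (half) number has 10 stored mantissa bits, or 11 significand bits counting the implicit leading bit. That is enough for only about 3 decimal digits of resolution: every integer up to 2048 is exactly representable, but from 2048 to 4096 only even integers are, and above 4096 only multiples of 4. When the numbers you are adding together differ by more than about 3 decimal orders of magnitude (as they do when you are adding 3 to a sum near 3600), you are not going to get the results you expect. In your loop, once wacc reaches 2048 every addition of 3 lands halfway between two representable values and is rounded so that wacc advances by 4, which is why your output moves in steps of 4 and overshoots 3600. The half format is not suitable (IMO) for such calculations.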
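To make the spacing concrete, here is a small sketch of my own (not from the original post) that adds two values in half precision and prints the rounded result. The helper name half_add is hypothetical, and I am assuming a GPU of compute capability 5.3 or newer (for example, compile with nvcc -arch=sm_60) so that __hadd is available in device code.

```cuda
#include <cstdio>
#include <cuda_fp16.h>

// Adds a and b in half precision on the device and stores the rounded sum as a float.
__global__ void half_add_kernel(float a, float b, float *out) {
    __half sum = __hadd(__float2half(a), __float2half(b));
    *out = __half2float(sum);
}

// Hypothetical host helper: runs one half-precision addition on the GPU.
// Error checking is omitted to keep the sketch short.
float half_add(float a, float b) {
    float *d_out = nullptr;
    float h_out = 0.0f;
    cudaMalloc(&d_out, sizeof(float));
    half_add_kernel<<<1, 1>>>(a, b, d_out);
    cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return h_out;
}

int main() {
    // Below 2048 every integer is representable, so this sum is exact.
    printf("2045 + 3 = %.1f\n", half_add(2045.0f, 3.0f));  // 2048.0
    // From 2048 to 4096 the spacing is 2; 2051 is a tie and rounds to 2052.
    printf("2048 + 3 = %.1f\n", half_add(2048.0f, 3.0f));  // 2052.0
    // Above 4096 the spacing is 4; 4099 rounds to 4100.
    printf("4096 + 3 = %.1f\n", half_add(4096.0f, 3.0f));  // 4100.0
    return 0;
}
```

The last line reproduces the 4096 to 4100 step visible in the output posted in the question.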

One approach would be to switch to FP32. You will run into a similar problem with FP32 numbers if/when you try to add two numbers that differ by more than about 6-7 decimal digits (decimal orders of magnitude).
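As a rough sketch of the FP32 approach (my own illustration, assuming the inputs stay in __half and only the running sum is widened), the kernel below performs the same 75 x 16 additions twice: once with a __half accumulator as in the question, and once with a float accumulator that is converted back to __half at the end. The names accumulate_kernel and run_accumulate are hypothetical. As before, this assumes compute capability 5.3 or newer (for example, nvcc -arch=sm_60).

```cuda
#include <cstdio>
#include <cuda_fp16.h>

// out[0]: sum accumulated in __half (as in the question).
// out[1]: sum accumulated in float, then rounded to __half once at the end.
__global__ void accumulate_kernel(float *out) {
    __half tmp = __float2half(3.0f);
    __half b = __float2half(1.0f);
    __half hacc = __float2half(0.0f);
    float facc = 0.0f;
    for (int i = 0; i < 75 * 16; i++) {
        __half prod = __hmul(tmp, b);
        hacc = __hadd(hacc, prod);      // rounded to half after every step
        facc += __half2float(prod);     // exact in float for these values
    }
    out[0] = __half2float(hacc);
    out[1] = __half2float(__float2half(facc));  // 3600 is representable in half
}

// Hypothetical host helper: copies both results back to the host.
// Error checking is omitted to keep the sketch short.
void run_accumulate(float h_out[2]) {
    float *d_out = nullptr;
    cudaMalloc(&d_out, 2 * sizeof(float));
    accumulate_kernel<<<1, 1>>>(d_out);
    cudaMemcpy(h_out, d_out, 2 * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
}

int main() {
    float result[2];
    run_accumulate(result);
    printf("half accumulator:  %.1f\n", result[0]);  // 4116.0, matching the question
    printf("float accumulator: %.1f\n", result[1]);  // 3600.0
    return 0;
}
```

The final value 3600 is itself exactly representable in half; it is only the intermediate additions of 3 that get rounded, so widening just the accumulator is enough in this case.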

To use FP16, it is even more important than with FP32 to pay attention to the scaling of the numbers used in the calculation, relative to each other. One possible method to get your result to end up at 3600 is to make sure that the value you add each time is “within range” of 3600. Rather than adding 3 each time, we could meet this need, for this particular case, by adding 30 each time (and doing 120 additions instead of 1200).

Here is an example:

$ cat t5.cu
#include <cstdio>

#include <cuda_fp16.h>
#include <math.h>

__global__ void warmup(){

    __half tmp = 3;
    __half b = 10;
    __half wacc = 0;
    for(int i = 0; i < 15; i++) {
        for(int j = 0; j < 8; j++) {
            if (i > 13) printf("index: %d, before: %f ", i, __half2float(wacc));
            wacc = __hadd(wacc, __hmul(tmp, b));
            if (i > 13) printf("after: %f\n ", __half2float(wacc));
        }
    }
}

int main(){

    warmup<<<1,1>>>();
    cudaDeviceSynchronize();
}
$ nvcc -o t5 t5.cu -arch=sm_60
$ ./t5
index: 14, before: 3360.000000 after: 3390.000000
 index: 14, before: 3390.000000 after: 3420.000000
 index: 14, before: 3420.000000 after: 3450.000000
 index: 14, before: 3450.000000 after: 3480.000000
 index: 14, before: 3480.000000 after: 3510.000000
 index: 14, before: 3510.000000 after: 3540.000000
 index: 14, before: 3540.000000 after: 3570.000000
 index: 14, before: 3570.000000 after: 3600.000000
$