Inference with TensorRT using half-precision floats on TX2 yields NaN outputs

I have a TensorFlow model that I convert to UFF format before importing it into TensorRT using the C++ API on the TX2.

I am following the example samples/sampleUffMNIST, which contains the following code to select either float32 or float16:

#if 0   // 1 selects float32, 0 selects float16
    if (!parser->parse(uffFile, *network, nvinfer1::DataType::kFLOAT))
        RETURN_AND_LOG(nullptr, ERROR, "Fail to parse");
#else
    if (!parser->parse(uffFile, *network, nvinfer1::DataType::kHALF))
        RETURN_AND_LOG(nullptr, ERROR, "Fail to parse");
#endif

When using float32, I verified that I get the same result on a test input as when performing inference directly from Python using TensorFlow. But when I switch to float16 mode as shown above, my output (an array of two float32 values) is two NaNs. What could explain this behavior? Using the sampleUffMNIST example itself, I verified that the float32 and float16 outputs are very similar.

I am using TensorRT 4.0, and my network is a fairly standard VGG16.

Thank you!


Do you also get a NaN result with our standard MNIST sample?

One possible issue is that FP16 mode gives a less accurate result, since the model was not trained with half precision.
This inaccuracy can be amplified by layers like softmax or ReLU.
It's recommended to output the tensor just before the softmax to check it.
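One way to inspect the pre-softmax tensor with the C++ API is to mark the softmax layer's input as an additional network output before building the engine. A sketch (assuming a single softmax; exact iteration details may vary by TensorRT version):

```
// Sketch: before building the engine, find the softmax layer and
// expose its input tensor as an extra network output for inspection.
for (int i = 0; i < network->getNbLayers(); ++i)
{
    nvinfer1::ILayer* layer = network->getLayer(i);
    if (layer->getType() == nvinfer1::LayerType::kSOFTMAX)
    {
        // The tensor feeding the softmax is its first input.
        network->markOutput(*layer->getInput(0));
    }
}
```

The extra output then appears as an additional binding at inference time, so you can copy it back to the host and check whether the NaNs are already present before the softmax.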