Inputting and outputting a TensorRT engine with FP16 optimization in C++

Hi,

I have a TensorRT engine of a network optimized to FP16 precision. The original network from which I built the engine has weights represented as FP32. I now want to run inference on the engine with an input tensor. I know all the steps, from allocating buffers to memcpying the output device buffers to the host buffers, but I am missing some crucial pieces of information about the engine's input and output:

  • When I feed a tensor to the engine, is it implicitly converted to FP16 inside the engine before inference, or do I need to manually convert the input tensor from FP32 to FP16?
  • When I read the output from the engine, is it implicitly converted back to FP32, or do I need to convert it back to FP32 manually?

Thanks,

Yinon

Hi,

Input and output are converted automatically by the TRT engine: the FP32 input is converted to FP16 internally, and the output is converted back to FP32.
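
In practice this means the host and device buffers keep their FP32 layout even though the engine computes in FP16. Below is a minimal sketch of the buffer sizing under that assumption. `DType`, `elementSize` and `bindingBytes` are hypothetical names used only for illustration and are not part of TensorRT; in a real application the element type and dimensions would be queried from the engine (for example via `getBindingDataType` and `getBindingDimensions` in TensorRT versions of this era) rather than hard-coded.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a binding's element type (not the TensorRT enum).
enum class DType { kFloat32, kFloat16 };

// Size in bytes of a single element of the given type.
inline std::size_t elementSize(DType t) {
    return t == DType::kFloat32 ? sizeof(float) : sizeof(std::uint16_t);
}

// Number of bytes to allocate for a binding with the given dimensions.
inline std::size_t bindingBytes(const std::vector<int>& dims, DType t) {
    std::size_t count = 1;
    for (int d : dims) {
        count *= static_cast<std::size_t>(d);
    }
    return count * elementSize(t);
}
```

With default FP32 bindings, a 3x224x224 input would be allocated as `bindingBytes({3, 224, 224}, DType::kFloat32)`, i.e. four bytes per element, regardless of the precision the engine uses internally.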

You can also use reformat-free I/O, a new feature added in TRT 6, which lets you feed an FP16 tensor to the engine directly.
Please refer to the sample below:
https://github.com/NVIDIA/TensorRT/blob/07ed9b57b1ff7c24664388e5564b17f7ce2873e5/samples/opensource/sampleReformatFreeIO/README.md
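
If an input binding is declared as FP16 through reformat-free I/O, the FP32 to FP16 conversion becomes the caller's job. The following is a minimal host-side sketch of that conversion, assuming IEEE 754 binary16 with round-to-nearest-even. `floatToHalfBits` and `toHalfBuffer` are hypothetical helper names, not TensorRT functions; where the CUDA toolkit is available, the conversion intrinsics in `cuda_fp16.h` (such as `__float2half`) are probably the better choice.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Convert one FP32 value to the bit pattern of an IEEE 754 half (FP16),
// rounding to nearest, ties to even. Hypothetical helper for illustration.
inline std::uint16_t floatToHalfBits(float f) {
    std::uint32_t x;
    std::memcpy(&x, &f, sizeof x);                 // read the raw FP32 bits
    const std::uint32_t sign = (x >> 16) & 0x8000u;
    const std::uint32_t absx = x & 0x7FFFFFFFu;

    if (absx >= 0x7F800000u) {                     // Inf or NaN
        const std::uint32_t nan = (absx > 0x7F800000u) ? 0x0200u : 0u;
        return static_cast<std::uint16_t>(sign | 0x7C00u | nan);
    }
    if (absx >= 0x47800000u) {                     // >= 65536: overflows to Inf
        return static_cast<std::uint16_t>(sign | 0x7C00u);
    }
    if (absx < 0x38800000u) {                      // below 2^-14: subnormal or zero
        if (absx < 0x33000000u) {                  // below 2^-25: rounds to zero
            return static_cast<std::uint16_t>(sign);
        }
        const std::uint32_t exp = absx >> 23;      // biased FP32 exponent, 102..112
        const std::uint32_t mant = (absx & 0x007FFFFFu) | 0x00800000u;
        const std::uint32_t shift = 126u - exp;    // 14..24
        std::uint32_t half = mant >> shift;
        const std::uint32_t rem = mant & ((1u << shift) - 1u);
        const std::uint32_t halfway = 1u << (shift - 1u);
        if (rem > halfway || (rem == halfway && (half & 1u))) {
            ++half;
        }
        return static_cast<std::uint16_t>(sign | half);
    }
    // Normal range: rebias the exponent (127 - 15 = 112) and drop 13 mantissa bits.
    std::uint32_t half = (absx - 0x38000000u) >> 13;
    const std::uint32_t rem = absx & 0x1FFFu;
    if (rem > 0x1000u || (rem == 0x1000u && (half & 1u))) {
        ++half;                                    // a carry into the exponent is intended
    }
    return static_cast<std::uint16_t>(sign | half);
}

// Convert a whole FP32 host buffer into FP16 bit patterns.
inline std::vector<std::uint16_t> toHalfBuffer(const std::vector<float>& src) {
    std::vector<std::uint16_t> dst;
    dst.reserve(src.size());
    for (float v : src) {
        dst.push_back(floatToHalfBits(v));
    }
    return dst;
}
```

The resulting buffer holds two bytes per element, so the device allocation and the host-to-device copy for that binding would be half the size of the FP32 case.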

Thanks

Yay!

Thanks for your suggestion. I currently have an earlier version of TRT (either 4 or 5, not sure) on my Jetson Nano, but I am hesitant to update it to a later version, mainly because several of my applications need backward compatibility. Could updating to TRT 6 break compatibility with applications that rely on previous versions of TRT?

Thanks,

Yinon

Hi,

You can just use FP32 input as well, and TRT will handle the conversion.
Reformat-free I/O was just an additional alternative.

Since I am not aware of your setup, I recommend trying FP32 inputs first.

Before updating, please refer to the support matrix to check compatibility:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-archived/tensorrt-601/tensorrt-support-matrix/index.html

Also, JetPack 4.3 Developer Preview, which is packaged with TRT 6, is available as a beta: https://developer.nvidia.com/jetpack-4_3_DP

Stay tuned for the official production release of JetPack 4.3.

Thanks