TensorRT FP16 mode

Hello, I have been using TensorRT 1.0 on the TX1 to classify images. Do I need to do anything special with the input when using FP16 mode, like converting the input data to FP16?

You don’t need to worry about casting the data from FP32 to FP16 - when FP16 is enabled through the TensorRT API, TensorRT automatically adds extra layers that convert the inputs and outputs between FP32 and FP16. It takes care of converting the outputs back as well, not just the inputs.

All you need to do to enable FP16 is to pass the nvinfer1::DataType::kHALF parameter to the ICaffeParser, followed by calling IBuilder::setHalf2Mode(). You can see a code example at https://github.com/dusty-nv/jetson-inference in the tensorNet.cpp source.
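For reference, here is a minimal sketch of what that looks like with the legacy TensorRT C++ API, modeled loosely on the tensorNet.cpp pattern from jetson-inference; the Logger class, the output blob name, the workspace size, and the lack of error handling are placeholder assumptions for illustration, not the original code:

#include <cstdio>
#include <NvInfer.h>
#include <NvCaffeParser.h>

using namespace nvinfer1;
using namespace nvcaffeparser1;

// Minimal logger required by the TensorRT builder.
class Logger : public ILogger
{
    void log(Severity severity, const char* msg) override
    {
        if (severity != Severity::kINFO)
            printf("[TRT] %s\n", msg);
    }
} gLogger;

ICudaEngine* buildFP16Engine(const char* deployFile, const char* modelFile,
                             const char* outputBlob, int maxBatchSize)
{
    IBuilder*           builder = createInferBuilder(gLogger);
    INetworkDefinition* network = builder->createNetwork();
    ICaffeParser*       parser  = createCaffeParser();

    // Check whether the device supports native FP16 at all.
    const bool hasFp16 = builder->platformHasFastFp16();

    // Parse the Caffe model; kHALF asks the parser to store the weights as FP16.
    const IBlobNameToTensor* blobs =
        parser->parse(deployFile, modelFile, *network,
                      hasFp16 ? DataType::kHALF : DataType::kFLOAT);

    // Mark the output blob so TensorRT keeps it.
    network->markOutput(*blobs->find(outputBlob));

    builder->setMaxBatchSize(maxBatchSize);
    builder->setMaxWorkspaceSize(16 << 20);

    // Enable the paired-FP16 ("half2") kernels.
    if (hasFp16)
        builder->setHalf2Mode(true);

    ICudaEngine* engine = builder->buildCudaEngine(*network);

    network->destroy();
    parser->destroy();
    builder->destroy();
    return engine;
}

Note that even with kHALF and half2 mode enabled, the input and output buffers you pass at inference time stay FP32, as described above.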

Thank you very much for your reply. I still have a problem: I run VGG forward inference on the TX1 with both the Caffe framework and TensorRT FP16 mode, but the performance difference I measure does not match the official results. Besides passing nvinfer1::DataType::kHALF to the ICaffeParser and then calling IBuilder::setHalf2Mode(), is there anything else I need to pay attention to for FP16 mode?

Hi,

Please remember to maximize the CPU/GPU frequencies for better performance:

sudo ./jetson_clocks.sh

Thanks.

Thank you very much for your reply! Performance improved, but the advantage over the Caffe framework is still not obvious. I also have a question: for a batch size greater than 1, do we just need to place the images of a batch in one contiguous block of memory, without modifying the "doInference()" function from the example? Does the buffers variable still point to two pointers? Can FP16 only be used when the batch size is even?

Hi,

FP16 can run with an odd batch size, but it is not optimal.
TensorRT will automatically pad it up to an even batch size.

The input tensor of TensorRT is NxCxHxW; the first dimension indicates the batch size.
Different batch sizes give different tensor dimensions, but the workflow is the same.
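To make the batching question concrete, here is a minimal sketch of a batched doInference() in the style of the TensorRT samples; the blob names "data"/"prob" and the size parameters are placeholder assumptions for your network. The host input is simply batchSize images packed contiguously, and buffers still holds exactly two device pointers, one for the input and one for the output:

#include <cassert>
#include <cuda_runtime_api.h>
#include <NvInfer.h>

using namespace nvinfer1;

// Placeholder blob names - use whatever your network actually defines.
static const char* INPUT_BLOB  = "data";
static const char* OUTPUT_BLOB = "prob";

// input:  batchSize images packed back-to-back (batchSize * imageSize floats)
// output: batchSize result vectors (batchSize * outputSize floats)
void doInference(IExecutionContext& context, const float* input, float* output,
                 int batchSize, size_t imageSize, size_t outputSize)
{
    const ICudaEngine& engine = context.getEngine();

    // Two bindings: one input, one output. The bindings array holds just
    // two device pointers regardless of the batch size.
    assert(engine.getNbBindings() == 2);
    void* buffers[2];
    const int inputIndex  = engine.getBindingIndex(INPUT_BLOB);
    const int outputIndex = engine.getBindingIndex(OUTPUT_BLOB);

    // Device buffers sized for the whole batch.
    cudaMalloc(&buffers[inputIndex],  batchSize * imageSize  * sizeof(float));
    cudaMalloc(&buffers[outputIndex], batchSize * outputSize * sizeof(float));

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Copy the packed batch to the device, run inference, copy results back.
    cudaMemcpyAsync(buffers[inputIndex], input,
                    batchSize * imageSize * sizeof(float),
                    cudaMemcpyHostToDevice, stream);
    context.enqueue(batchSize, buffers, stream, nullptr);
    cudaMemcpyAsync(output, buffers[outputIndex],
                    batchSize * outputSize * sizeof(float),
                    cudaMemcpyDeviceToHost, stream);
    cudaStreamSynchronize(stream);

    cudaStreamDestroy(stream);
    cudaFree(buffers[inputIndex]);
    cudaFree(buffers[outputIndex]);
}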

By the way, there have been two newer TensorRT releases. It's recommended to try them if that is possible for you.

Thanks.