TensorRT FP16 mode

Hello, I have been using TensorRT 1.0 on the TX1 to classify images. Do I need to do anything special with the input when using FP16 mode, like converting the input data to FP16?

You don’t need to worry about casting the data from FP32 to FP16 - when FP16 is enabled through the TensorRT API, TensorRT automatically adds extra layers that convert the inputs and outputs between FP32 and FP16. It takes care of converting the outputs back as well, not just the inputs.

All you need to do to enable FP16 is to pass the nvinfer1::DataType::kHALF parameter to the ICaffeParser, followed by calling IBuilder::setHalf2Mode(). You can see a code example at https://github.com/dusty-nv/jetson-inference in the tensorNet.cpp source.
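For reference, here is a minimal sketch of what that looks like with the legacy TensorRT C++ API, modeled loosely on the tensorNet.cpp pattern from jetson-inference; the Logger class, the output blob name, the workspace size, and the lack of error handling are placeholder assumptions for illustration, not the original code:

#include <cstdio>
#include <NvInfer.h>
#include <NvCaffeParser.h>

using namespace nvinfer1;
using namespace nvcaffeparser1;

// Minimal logger required by the TensorRT builder.
class Logger : public ILogger
{
    void log(Severity severity, const char* msg) override
    {
        if (severity != Severity::kINFO)
            printf("[TRT] %s\n", msg);
    }
} gLogger;

ICudaEngine* buildFP16Engine(const char* deployFile, const char* modelFile,
                             const char* outputBlob, int maxBatchSize)
{
    IBuilder*           builder = createInferBuilder(gLogger);
    INetworkDefinition* network = builder->createNetwork();
    ICaffeParser*       parser  = createCaffeParser();

    // Check whether the device supports native FP16 at all.
    const bool hasFp16 = builder->platformHasFastFp16();

    // Parse the Caffe model; kHALF asks the parser to store the weights as FP16.
    const IBlobNameToTensor* blobs =
        parser->parse(deployFile, modelFile, *network,
                      hasFp16 ? DataType::kHALF : DataType::kFLOAT);

    // Mark the output blob so TensorRT keeps it.
    network->markOutput(*blobs->find(outputBlob));

    builder->setMaxBatchSize(maxBatchSize);
    builder->setMaxWorkspaceSize(16 << 20);

    // Enable the paired-FP16 ("half2") kernels.
    if (hasFp16)
        builder->setHalf2Mode(true);

    ICudaEngine* engine = builder->buildCudaEngine(*network);

    network->destroy();
    parser->destroy();
    builder->destroy();
    return engine;
}

Note that even with kHALF and half2 mode enabled, the input and output buffers you pass at inference time stay FP32, as described above.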

Thank you very much for your reply. I still have a problem: I run VGG forward inference on the TX1 with both the Caffe framework and TensorRT FP16 mode, but the performance difference I measure does not match the official results. Besides passing nvinfer1::DataType::kHALF to the ICaffeParser and then calling IBuilder::setHalf2Mode(), is there anything else I need to pay attention to for FP16 mode?

Hi,

Please remember to maximize the CPU/GPU frequencies for better performance:

sudo ./jetson_clocks.sh

Thanks.

Thank you very much for your reply! Performance improved, but the advantage over the Caffe framework is still not obvious. I also have a question: for a batch size greater than 1, do we just need to place the images of a batch in one contiguous block of memory, without modifying the "doInference()" function from the example? Does the buffers variable still point to two pointers? Can FP16 only be used when the batch size is even?

Hi,

FP16 can run with an odd batch size, but it is not optimal.
TensorRT will automatically pad it up to an even batch size.

The input tensor of TensorRT is NxCxHxW; the first dimension indicates the batch size.
Different batch sizes give different tensor dimensions, but the workflow is the same.
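To make the batching question concrete, here is a minimal sketch of a batched doInference() in the style of the TensorRT samples; the blob names "data"/"prob" and the size parameters are placeholder assumptions for your network. The host input is simply batchSize images packed contiguously, and buffers still holds exactly two device pointers, one for the input and one for the output:

#include <cassert>
#include <cuda_runtime_api.h>
#include <NvInfer.h>

using namespace nvinfer1;

// Placeholder blob names - use whatever your network actually defines.
static const char* INPUT_BLOB  = "data";
static const char* OUTPUT_BLOB = "prob";

// input:  batchSize images packed back-to-back (batchSize * imageSize floats)
// output: batchSize result vectors (batchSize * outputSize floats)
void doInference(IExecutionContext& context, const float* input, float* output,
                 int batchSize, size_t imageSize, size_t outputSize)
{
    const ICudaEngine& engine = context.getEngine();

    // Two bindings: one input, one output. The bindings array holds just
    // two device pointers regardless of the batch size.
    assert(engine.getNbBindings() == 2);
    void* buffers[2];
    const int inputIndex  = engine.getBindingIndex(INPUT_BLOB);
    const int outputIndex = engine.getBindingIndex(OUTPUT_BLOB);

    // Device buffers sized for the whole batch.
    cudaMalloc(&buffers[inputIndex],  batchSize * imageSize  * sizeof(float));
    cudaMalloc(&buffers[outputIndex], batchSize * outputSize * sizeof(float));

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Copy the packed batch to the device, run inference, copy results back.
    cudaMemcpyAsync(buffers[inputIndex], input,
                    batchSize * imageSize * sizeof(float),
                    cudaMemcpyHostToDevice, stream);
    context.enqueue(batchSize, buffers, stream, nullptr);
    cudaMemcpyAsync(output, buffers[outputIndex],
                    batchSize * outputSize * sizeof(float),
                    cudaMemcpyDeviceToHost, stream);
    cudaStreamSynchronize(stream);

    cudaStreamDestroy(stream);
    cudaFree(buffers[inputIndex]);
    cudaFree(buffers[outputIndex]);
}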

By the way, there have been two newer TensorRT releases. It's recommended to try them if that is possible for you.

Thanks.