Speed of FP32 vs FP16

Hi! I trained YOLO models and then converted them to FP32 and FP16 TensorRT engines to use with DeepStream. It seems that there is no speed-up (at least on a Jetson Nano). Yes, the engine (and the model) is smaller, but the speed is the same (with res=320,320 and interval=0 it’s about 18 FPS). What’s the reason for that? Is it because NMS only supports FP32 and slows everything down?

How can I speed up inference, apart from using the interval parameter in the DeepStream config (so that predictions are computed only every Nth frame)?
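For reference, the interval setting mentioned above lives in the nvinfer section of a deepstream-app config. A minimal sketch (section and property names follow the standard deepstream-app config format; the values are examples, not a recommendation):

```
[primary-gie]
enable=1
# Run inference only on every 2nd frame; intermediate frames
# reuse the last detections (or a tracker, if one is configured).
interval=1
```

Raising interval trades detection latency on new objects for throughput, so it helps most when a tracker is enabled to bridge the skipped frames.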

How did you convert them to FP32 and FP16? With the command tlt-export or tlt-converter?

Also, I suggest generating the TensorRT engine and then using trtexec to test FPS.
Refer to Accelerating Peoplnet with tlt for jetson nano - #13 by Morganh

Well, first with tlt-export on my desktop, and then with tlt-converter on the Jetson Nano.

Please note that

  1. After tlt-export, whether you specify fp32 or fp16, the resulting etlt model is the same; the precision only takes effect when the engine is built. (Difference in data type specified during tlt export and tlt convert - #5 by Morganh)
  2. Please use trtexec to test FPS. Before testing, generate the TensorRT engine on the Nano with the tlt-converter tool.
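Putting those two steps together, it might look roughly like this on the Nano (a sketch, not verified on your setup: the key, the input dimensions, the output node name, and the file names are placeholders for your own model):

```shell
# 1) Build an FP16 TensorRT engine from the .etlt model on the Nano itself.
#    The .etlt file is precision-agnostic; -t selects the engine precision.
#    -d and -o must match your model's input dims and output node(s).
tlt-converter -k $YOUR_NGC_KEY \
              -d 3,320,320 \
              -t fp16 \
              -e yolo_fp16.engine \
              yolo.etlt

# 2) Benchmark the raw engine with trtexec (shipped with TensorRT,
#    typically under /usr/src/tensorrt/bin on Jetson). The reported
#    throughput is the pure inference FPS, without DeepStream overhead.
/usr/src/tensorrt/bin/trtexec --loadEngine=yolo_fp16.engine
```

Comparing the trtexec numbers for the FP32 and FP16 engines shows whether the lack of speed-up comes from the engine itself or from the rest of the DeepStream pipeline.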