NX & TRT & Jetson-inference - Not setting precision to INT8

a428tm · November 15, 2020, 3:39pm

Hi,

I went through the jetson-inference repo and worked through its re-training/transfer learning tutorial: https://github.com/dusty-nv/jetson-inference/blob/master/docs/pytorch-ssd.md

I was able to train on the dataset I wanted (downloaded from Open Images) and successfully run the modified detectnet.py.

My main goal was to quickly train using the provided code and models so I could compare the performance of the TX2 and NX. When I ran it on the TX2, the ONNX model was converted to FP16 (which makes sense).

However, when I ran the same program on the NX, I noticed that TRT chose FP16 precision rather than INT8, with the log output shown below:

[TRT] TensorRT version 7.1.3
[TRT] loading NVIDIA plugins…
[TRT] Registered plugin creator - ::GridAnchor_TRT version 1
[TRT] Registered plugin creator - ::NMS_TRT version 1
[TRT] Registered plugin creator - ::Reorg_TRT version 1
[TRT] Registered plugin creator - ::Region_TRT version 1
[TRT] Registered plugin creator - ::Clip_TRT version 1
[TRT] Registered plugin creator - ::LReLU_TRT version 1
[TRT] Registered plugin creator - ::PriorBox_TRT version 1
[TRT] Registered plugin creator - ::Normalize_TRT version 1
[TRT] Registered plugin creator - ::RPROI_TRT version 1
[TRT] Registered plugin creator - ::BatchedNMS_TRT version 1
[TRT] Could not register plugin creator - ::FlattenConcat_TRT version 1
[TRT] Registered plugin creator - ::CropAndResize version 1
[TRT] Registered plugin creator - ::DetectionLayer_TRT version 1
[TRT] Registered plugin creator - ::Proposal version 1
[TRT] Registered plugin creator - ::ProposalLayer_TRT version 1
[TRT] Registered plugin creator - ::PyramidROIAlign_TRT version 1
[TRT] Registered plugin creator - ::ResizeNearest_TRT version 1
[TRT] Registered plugin creator - ::Split version 1
[TRT] Registered plugin creator - ::SpecialSlice_TRT version 1
[TRT] Registered plugin creator - ::InstanceNormalization_TRT version 1
[TRT] detected model format - ONNX (extension ‘.onnx’)
[TRT] desired precision specified for GPU: FASTEST
[TRT] requested fasted precision for device GPU without providing valid calibrator, disabling INT8
[TRT] native precisions detected for GPU: FP32, FP16, INT8
[TRT] selecting fastest native precision for GPU: FP16
[TRT] attempting to open engine cache file /home/z5-xnx-dev-1/jetson-inference/python/training/detection/ssd/models/lpr/ssd-mobilenet.onnx.1.1.7103.GPU.FP16.engine
[TRT] loading network plan from engine cache… /home/z5-xnx-dev-1/jetson-inference/python/training/detection/ssd/models/lpr/ssd-mobilenet.onnx.1.1.7103.GPU.FP16.engine
[TRT] device GPU, loaded /home/z5-xnx-dev-1/jetson-inference/python/training/detection/ssd/models/lpr/ssd-mobilenet.onnx
[TRT] Deserialize required 2885213 microseconds.
[TRT]

If possible, I would like to learn how to make the model run in INT8 mode. I saw that Transfer Learning Toolkit has ways to do it via tlt-converter and tlt-export. Could I use those for this, or is that not possible since I am using PyTorch?

Would appreciate any pointer.

Thanks!

AastaLLL · November 16, 2020, 3:37am

Hi,

Do you want to run INT8 mode on TX2?
If yes, this is not supported.

INT8 inference requires GPU architecture (compute capability) 6.1, or 7.x and later.
But the TX2 GPU architecture is 6.2.
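
The rule above can be written as a small check. This is only an illustrative sketch: `supports_int8` and the `BOARDS` table are hypothetical names, and the Xavier NX value (7.2) is not stated in this thread but is its published compute capability.

```python
def supports_int8(major: int, minor: int) -> bool:
    """Rule as stated in this reply: compute capability 6.1, or 7.x and later."""
    return (major, minor) == (6, 1) or major >= 7


# Compute capabilities of the two boards discussed in this thread.
# The TX2 value comes from the reply; the Xavier NX value is an added assumption.
BOARDS = {
    "Jetson TX2": (6, 2),
    "Jetson Xavier NX": (7, 2),
}

for name, (major, minor) in BOARDS.items():
    print(f"{name}: {major}.{minor} -> INT8 supported: {supports_int8(major, minor)}")
```

By this rule the TX2 falls back to FP16, while the Xavier NX hardware is able to run INT8.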

For more detail, you can check our support matrix below:

Thanks.

a428tm · November 16, 2020, 4:01am

@AastaLLL

Thanks for the quick response.

No. I meant that I want to run INT8 mode on the Xavier NX, using the same tutorial from the repo: https://github.com/dusty-nv/jetson-inference/blob/master/docs/pytorch-ssd.md
Currently, the example I ran automatically chooses FP16.

So I am trying to figure out how to configure TRT to choose INT8 on the Xavier NX.

Thanks

dusty_nv · November 16, 2020, 5:10pm

Hi @a428tm, the jetson-inference repo isn’t set up to do the quantization calibration for INT8 models, so it uses FP16 instead.
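
For readers following the log in the first post, the fallback it describes can be sketched roughly as follows. This is not the jetson-inference source; `select_precision` is a hypothetical function that only mirrors the sequence of log messages (INT8 is dropped when no calibrator is provided, then the fastest remaining native precision is picked).

```python
def select_precision(native, has_calibrator):
    """Pick the fastest usable precision, mirroring the log messages.

    native: precisions the GPU reports, e.g. {"FP32", "FP16", "INT8"}
    has_calibrator: whether a valid INT8 calibrator was supplied
    """
    candidates = set(native)
    if not has_calibrator:
        # "without providing valid calibrator, disabling INT8"
        candidates.discard("INT8")
    # Assumed ordering from fastest to slowest.
    for precision in ("INT8", "FP16", "FP32"):
        if precision in candidates:
            return precision
    raise ValueError("no supported precision available")


# The Xavier NX case from the log: INT8 is native, but no calibrator was given.
print(select_precision({"FP32", "FP16", "INT8"}, has_calibrator=False))
```

With `has_calibrator=False` the sketch selects FP16 even though INT8 is listed as a native precision, which matches what the log reports.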

If you want to use Transfer Learning Toolkit and INT8, you can use DeepStream for the inferencing of those models.

If you just want to quickly test the runtime performance, you can use the trtexec benchmark utility for that (found under /usr/src/tensorrt/bin). It can do both INT8 and FP16.
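
A minimal sketch of driving trtexec for both precisions from Python is below. The `--onnx`, `--fp16` and `--int8` options are standard trtexec flags; the helper name `trtexec_command` and the model path are hypothetical placeholders. Note that without a calibration cache, `--int8` is only meaningful for timing, not for accuracy.

```python
import os
import subprocess

# Location given in the reply above.
TRTEXEC = "/usr/src/tensorrt/bin/trtexec"


def trtexec_command(onnx_path, precision):
    """Build the trtexec argument list for an ONNX model at 'fp16' or 'int8'."""
    flags = {"fp16": "--fp16", "int8": "--int8"}
    return [TRTEXEC, f"--onnx={onnx_path}", flags[precision]]


if __name__ == "__main__":
    model = "ssd-mobilenet.onnx"  # hypothetical path; point this at your exported model
    for precision in ("fp16", "int8"):
        cmd = trtexec_command(model, precision)
        print(" ".join(cmd))
        # Only run the benchmark when the tool and the model are actually present.
        if os.path.exists(TRTEXEC) and os.path.exists(model):
            subprocess.run(cmd, check=False)
```

Comparing the timing summaries that trtexec prints for the two runs gives a quick estimate of what INT8 would buy on the NX before investing in calibration.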
