NX & TRT & Jetson-inference - Not setting precision to INT8

Hi,

I went through the jetson-inference repo and have been working with its re-training/transfer learning tutorial: https://github.com/dusty-nv/jetson-inference/blob/master/docs/pytorch-ssd.md

I was able to train on the dataset I wanted (downloaded from Open Images), and I was able to successfully run the modified detectnet.py.

My main goal was to quickly train with the provided code and models so I could compare the performance of TX2 and NX. When I ran it on TX2, the ONNX model was converted to FP16 (which makes sense).

However, when I ran the same program on the NX, I noticed that TRT chose FP16 precision rather than INT8, with the log output shown below:

[TRT] TensorRT version 7.1.3
[TRT] loading NVIDIA plugins…
[TRT] Registered plugin creator - ::GridAnchor_TRT version 1
[TRT] Registered plugin creator - ::NMS_TRT version 1
[TRT] Registered plugin creator - ::Reorg_TRT version 1
[TRT] Registered plugin creator - ::Region_TRT version 1
[TRT] Registered plugin creator - ::Clip_TRT version 1
[TRT] Registered plugin creator - ::LReLU_TRT version 1
[TRT] Registered plugin creator - ::PriorBox_TRT version 1
[TRT] Registered plugin creator - ::Normalize_TRT version 1
[TRT] Registered plugin creator - ::RPROI_TRT version 1
[TRT] Registered plugin creator - ::BatchedNMS_TRT version 1
[TRT] Could not register plugin creator - ::FlattenConcat_TRT version 1
[TRT] Registered plugin creator - ::CropAndResize version 1
[TRT] Registered plugin creator - ::DetectionLayer_TRT version 1
[TRT] Registered plugin creator - ::Proposal version 1
[TRT] Registered plugin creator - ::ProposalLayer_TRT version 1
[TRT] Registered plugin creator - ::PyramidROIAlign_TRT version 1
[TRT] Registered plugin creator - ::ResizeNearest_TRT version 1
[TRT] Registered plugin creator - ::Split version 1
[TRT] Registered plugin creator - ::SpecialSlice_TRT version 1
[TRT] Registered plugin creator - ::InstanceNormalization_TRT version 1
[TRT] detected model format - ONNX (extension ‘.onnx’)
[TRT] desired precision specified for GPU: FASTEST
[TRT] requested fasted precision for device GPU without providing valid calibrator, disabling INT8
[TRT] native precisions detected for GPU: FP32, FP16, INT8
[TRT] selecting fastest native precision for GPU: FP16
[TRT] attempting to open engine cache file /home/z5-xnx-dev-1/jetson-inference/python/training/detection/ssd/models/lpr/ssd-mobilenet.onnx.1.1.7103.GPU.FP16.engine
[TRT] loading network plan from engine cache… /home/z5-xnx-dev-1/jetson-inference/python/training/detection/ssd/models/lpr/ssd-mobilenet.onnx.1.1.7103.GPU.FP16.engine
[TRT] device GPU, loaded /home/z5-xnx-dev-1/jetson-inference/python/training/detection/ssd/models/lpr/ssd-mobilenet.onnx
[TRT] Deserialize required 2885213 microseconds.
[TRT]

If possible, I would like to learn how to make the model run in INT8 mode. I saw that Transfer Learning Toolkit can do this via tlt-converter and tlt-export. Could I use it here, or is that not possible because I am using PyTorch?

I would appreciate any pointers.

Thanks!

Hi,

Do you want to run INT8 mode on TX2?
If yes, this is not supported.

INT8 inference requires GPU compute capability 6.1 or 7.x and later.
But the TX2 GPU is compute capability 6.2.
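
As a rough illustration of the rule above, here is a minimal Python sketch. The function name `supports_int8` is hypothetical, and the rule is read as "compute capability 6.1, or 7.0 and later". The Xavier NX value of 7.2 is not stated in this thread and is added here for comparison.

```python
def supports_int8(major: int, minor: int) -> bool:
    """Apply the rule quoted above: INT8 needs compute capability 6.1, or 7.0 and later."""
    return (major, minor) == (6, 1) or major >= 7


# Compute capabilities used for illustration.
# TX2 (6.2) comes from the reply above; Xavier NX (7.2) is an added assumption.
devices = {
    "Jetson TX2": (6, 2),
    "Jetson Xavier NX": (7, 2),
}

for name, (major, minor) in devices.items():
    print(f"{name}: compute capability {major}.{minor}, INT8 supported: {supports_int8(major, minor)}")
```

Under this reading, the TX2 falls outside the supported set while the Xavier NX falls inside it.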

For more detail, you can check our support matrix below:

Thanks.

@AastaLLL

Thanks for the quick response.

No, I meant that I want to run INT8 mode on Xavier NX, using this tutorial from the repo: https://github.com/dusty-nv/jetson-inference/blob/master/docs/pytorch-ssd.md
Currently, the example I ran automatically chooses FP16.

So I am trying to figure out how to configure TRT to choose INT8 on Xavier NX.

Thanks

Hi @a428tm, the jetson-inference repo isn’t set up to do the quantization calibration that INT8 models require, so it falls back to FP16 instead.
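
To make the fallback concrete, here is a minimal Python sketch of the selection behaviour that the log messages describe. It is not the actual jetson-inference source: the function name `select_fastest_precision` and the assumed speed ordering (INT8, then FP16, then FP32) are illustrative only.

```python
def select_fastest_precision(native_precisions, has_calibrator):
    """Pick the fastest usable precision, mirroring the log messages in this thread.

    INT8 is dropped when no calibrator is available, which is why a device
    that natively supports INT8 can still end up running in FP16.
    """
    usable = list(native_precisions)
    if "INT8" in usable and not has_calibrator:
        # Corresponds to the "disabling INT8" message in the log.
        usable.remove("INT8")

    # Assumed ordering from fastest to slowest.
    for precision in ("INT8", "FP16", "FP32"):
        if precision in usable:
            return precision
    raise ValueError("no supported precision available")


# Xavier NX case from the log: INT8 is native, but no calibrator is provided.
print(select_fastest_precision(["FP32", "FP16", "INT8"], has_calibrator=False))
```

With a calibrator available, the same call would return INT8 under these assumptions.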

If you want to use Transfer Learning Toolkit and INT8, you can use DeepStream for the inferencing of those models.

If you just want to quickly test the runtime performance, you can use the trtexec benchmark utility for that (found under /usr/src/tensorrt/bin). It can benchmark in both INT8 and FP16.
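
As a sketch of how such a benchmark could be scripted, the Python helper below builds trtexec command lines for each precision. The helper name `build_trtexec_command` is hypothetical, and the model filename is taken from the log above; adjust the path to wherever the ONNX file actually lives. The `--onnx`, `--fp16` and `--int8` options are standard trtexec flags. These runs are useful for timing only and say nothing about INT8 accuracy, since no calibration data is supplied.

```python
TRTEXEC = "/usr/src/tensorrt/bin/trtexec"  # location mentioned in the reply above


def build_trtexec_command(onnx_path, precision):
    """Return the trtexec argument list for benchmarking an ONNX model.

    precision is one of "FP32", "FP16" or "INT8". FP32 is trtexec's default,
    so no extra flag is added for it.
    """
    flags = {"FP32": [], "FP16": ["--fp16"], "INT8": ["--int8"]}
    if precision not in flags:
        raise ValueError(f"unsupported precision: {precision}")
    return [TRTEXEC, f"--onnx={onnx_path}"] + flags[precision]


if __name__ == "__main__":
    # Print the commands so they can be copied into a shell on the Jetson.
    for precision in ("FP16", "INT8"):
        print(" ".join(build_trtexec_command("ssd-mobilenet.onnx", precision)))
```

On the device itself, each list could be passed to `subprocess.run(cmd, check=True)` to execute the benchmark and compare the reported timings.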
