Hi,
I went through the jetson-inference repo and was working with the re-training/transfer-learning side of it - https://github.com/dusty-nv/jetson-inference/blob/master/docs/pytorch-ssd.md
I was able to train on the dataset I wanted (downloaded from Open Images) and successfully ran the modified detectnet.py, launched roughly as shown below.
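For reference, the relevant part of my modified detectnet.py looks roughly like this (the models/lpr paths are my own; the input/output blob names are the ones the pytorch-ssd.md doc uses for the exported ONNX model):

```python
import jetson.inference

# Load the retrained SSD-Mobilenet ONNX model exported by onnx_export.py.
# The blob names (input_0 / scores / boxes) come from the export script.
net = jetson.inference.detectNet(argv=[
    "--model=models/lpr/ssd-mobilenet.onnx",
    "--labels=models/lpr/labels.txt",
    "--input-blob=input_0",
    "--output-cvg=scores",
    "--output-bbox=boxes",
])
```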
My main goal was to quickly train from the provided code/models to compare the performance of the TX2 and the Xavier NX. When I ran it on the TX2, the ONNX model was converted to an FP16 engine (which makes sense, since the TX2's GPU does not support INT8).
However, when I ran the same program on the Xavier NX, I noticed that TensorRT chose FP16 precision rather than INT8, with the log output shown below:
[TRT] TensorRT version 7.1.3
[TRT] loading NVIDIA plugins…
[TRT] Registered plugin creator - ::GridAnchor_TRT version 1
[TRT] Registered plugin creator - ::NMS_TRT version 1
[TRT] Registered plugin creator - ::Reorg_TRT version 1
[TRT] Registered plugin creator - ::Region_TRT version 1
[TRT] Registered plugin creator - ::Clip_TRT version 1
[TRT] Registered plugin creator - ::LReLU_TRT version 1
[TRT] Registered plugin creator - ::PriorBox_TRT version 1
[TRT] Registered plugin creator - ::Normalize_TRT version 1
[TRT] Registered plugin creator - ::RPROI_TRT version 1
[TRT] Registered plugin creator - ::BatchedNMS_TRT version 1
[TRT] Could not register plugin creator - ::FlattenConcat_TRT version 1
[TRT] Registered plugin creator - ::CropAndResize version 1
[TRT] Registered plugin creator - ::DetectionLayer_TRT version 1
[TRT] Registered plugin creator - ::Proposal version 1
[TRT] Registered plugin creator - ::ProposalLayer_TRT version 1
[TRT] Registered plugin creator - ::PyramidROIAlign_TRT version 1
[TRT] Registered plugin creator - ::ResizeNearest_TRT version 1
[TRT] Registered plugin creator - ::Split version 1
[TRT] Registered plugin creator - ::SpecialSlice_TRT version 1
[TRT] Registered plugin creator - ::InstanceNormalization_TRT version 1
[TRT] detected model format - ONNX (extension ‘.onnx’)
[TRT] desired precision specified for GPU: FASTEST
[TRT] requested fasted precision for device GPU without providing valid calibrator, disabling INT8
[TRT] native precisions detected for GPU: FP32, FP16, INT8
[TRT] selecting fastest native precision for GPU: FP16
[TRT] attempting to open engine cache file /home/z5-xnx-dev-1/jetson-inference/python/training/detection/ssd/models/lpr/ssd-mobilenet.onnx.1.1.7103.GPU.FP16.engine
[TRT] loading network plan from engine cache… /home/z5-xnx-dev-1/jetson-inference/python/training/detection/ssd/models/lpr/ssd-mobilenet.onnx.1.1.7103.GPU.FP16.engine
[TRT] device GPU, loaded /home/z5-xnx-dev-1/jetson-inference/python/training/detection/ssd/models/lpr/ssd-mobilenet.onnx
[TRT] Deserialize required 2885213 microseconds.
[TRT]
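If I'm reading the "without providing valid calibrator, disabling INT8" line correctly, TensorRT will only enable INT8 when it is given an INT8 calibrator (or a calibration cache) to determine the dynamic ranges of the tensors. To understand the mechanics, I sketched what I think a standalone INT8 build of the exported ONNX would look like with the TensorRT Python API. This is untested, and the calib_images/ directory, the 300x300 input size, and the normalization are assumptions on my part that would need to match detectnet's actual preprocessing:

```python
import os
import glob
import numpy as np
from PIL import Image
import pycuda.autoinit  # noqa: F401 -- initializes the CUDA context
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.INFO)

class SSDEntropyCalibrator(trt.IInt8EntropyCalibrator2):
    """Feeds preprocessed sample images to TensorRT for INT8 calibration."""
    def __init__(self, image_dir, cache_file, input_shape=(3, 300, 300)):
        trt.IInt8EntropyCalibrator2.__init__(self)
        self.cache_file = cache_file
        self.files = sorted(glob.glob(os.path.join(image_dir, "*.jpg")))
        self.index = 0
        self.device_input = cuda.mem_alloc(int(np.prod(input_shape)) * 4)

    def get_batch_size(self):
        return 1

    def get_batch(self, names):
        if self.index >= len(self.files):
            return None  # no more data -> calibration finishes
        # Assumed preprocessing: RGB, resized to 300x300, scaled to [0,1],
        # then normalized. This must mirror what detectnet actually does,
        # otherwise the calibration ranges will be wrong.
        img = Image.open(self.files[self.index]).convert("RGB").resize((300, 300))
        arr = np.asarray(img, dtype=np.float32).transpose(2, 0, 1)
        arr = (arr / 255.0 - 0.5) / 0.5
        cuda.memcpy_htod(self.device_input, np.ascontiguousarray(arr))
        self.index += 1
        return [int(self.device_input)]

    def read_calibration_cache(self):
        if os.path.exists(self.cache_file):
            with open(self.cache_file, "rb") as f:
                return f.read()

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)

def build_int8_engine(onnx_path, engine_path):
    flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    with trt.Builder(TRT_LOGGER) as builder, \
         builder.create_network(flags) as network, \
         trt.OnnxParser(network, TRT_LOGGER) as parser:
        with open(onnx_path, "rb") as f:
            if not parser.parse(f.read()):
                raise RuntimeError(parser.get_error(0))
        config = builder.create_builder_config()
        config.max_workspace_size = 1 << 28
        config.set_flag(trt.BuilderFlag.INT8)
        config.int8_calibrator = SSDEntropyCalibrator("calib_images/",
                                                      "calibration.cache")
        engine = builder.build_engine(network, config)
        if engine is None:
            raise RuntimeError("engine build failed")
        with open(engine_path, "wb") as f:
            f.write(engine.serialize())

build_int8_engine("models/lpr/ssd-mobilenet.onnx", "ssd-mobilenet.int8.engine")
```

Even if a build like this works, as far as I can tell the jetson-inference Python binding (jetson.inference.detectNet) does not expose a way to pass in a calibrator or a prebuilt engine, which is what leads to my question: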
If possible, I was hoping to learn how to make the model run in INT8 mode. I saw that the Transfer Learning Toolkit has ways to do this via tlt-converter & tlt-export. Could I use those here, or does that not apply since I am training in PyTorch?
Would appreciate any pointers.
Thanks!