I have a custom-trained model that I converted to .engine files with trtexec, in both FP32 and FP16, on a Jetson Nano, and I have used those engines for inference (example code: https://github.com/NVIDIA-AI-IOT/yolo_deepstream).
I now need to convert the same custom model to INT8, for which I need an INT8 calibration cache (a .cache file) that trtexec --calib=<file> will accept and use.
Environment:
- Device: NVIDIA Jetson Nano Developer Kit
- L4T / JetPack: L4T 32.7.1 / JetPack 4.6.1
- OS: Ubuntu 18.04.6 LTS (kernel 4.9.253-tegra)
- CUDA: 10.2.300 (arch 5.3)
- cuDNN: 8.2.1.32
- TensorRT: 8.2.1.8
- Python: 3.6.5
Queries:
- How do I generate a valid INT8 calibration cache (calib.cache) for trtexec? (See the calibrator sketch below.)
- How can I confirm the cache was actually used when the INT8 engine was built?
- Why does Polygraphy fail on the Jetson Nano in this environment (is Python 3.6.5 the blocker?), and is there an alternative?
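For reference, this is the cache generator I am planning to try, based on the IInt8EntropyCalibrator2 pattern from the TensorRT Python samples. It is only a sketch: the class name EntropyCalibrator, the input shape (3, 416, 416), and the preprocess() function are my own placeholders, and the preprocessing in particular must be replaced with the exact pipeline the model sees at inference time.

```python
import os

import cv2
import numpy as np
import pycuda.autoinit  # noqa: F401 -- creates the CUDA context
import pycuda.driver as cuda
import tensorrt as trt


def preprocess(path, shape):
    """Placeholder preprocessing -- replace with the model's real pipeline."""
    c, h, w = shape
    img = cv2.imread(path)
    img = cv2.resize(img, (w, h))
    img = img[:, :, ::-1].astype(np.float32) / 255.0  # BGR -> RGB, [0, 1]
    return img.transpose(2, 0, 1)  # HWC -> CHW


class EntropyCalibrator(trt.IInt8EntropyCalibrator2):
    """Feeds real calibration batches to TensorRT and writes calib.cache."""

    def __init__(self, image_dir, cache_file, batch_size=1, input_shape=(3, 416, 416)):
        super().__init__()
        self.cache_file = cache_file
        self.batch_size = batch_size
        self.input_shape = input_shape
        self.files = [os.path.join(image_dir, f) for f in sorted(os.listdir(image_dir))]
        self.index = 0
        # One device buffer large enough for a full float32 batch.
        self.device_input = cuda.mem_alloc(
            batch_size * int(np.prod(input_shape)) * np.dtype(np.float32).itemsize)

    def get_batch_size(self):
        return self.batch_size

    def get_batch(self, names):
        if self.index + self.batch_size > len(self.files):
            return None  # tells TensorRT the calibration data is exhausted
        batch = np.stack([
            preprocess(f, self.input_shape)
            for f in self.files[self.index:self.index + self.batch_size]
        ]).astype(np.float32)
        cuda.memcpy_htod(self.device_input, np.ascontiguousarray(batch))
        self.index += self.batch_size
        return [int(self.device_input)]

    def read_calibration_cache(self):
        # Reuse an existing cache so repeated builds skip calibration.
        if os.path.exists(self.cache_file):
            with open(self.cache_file, "rb") as f:
                return f.read()
        return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)
```

As far as I can tell, calib.cache only gets written when TensorRT actually drives this calibrator during an INT8 build, so the class on its own is not enough (see the build sketch further down).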
I intend to use the calib.cache with the trtexec command-line tool via the --calib=calib.cache flag.
The command I will use for the INT8 conversion is:
/usr/src/tensorrt/bin/trtexec --onnx=/path/to/input/onnx/weight/file --saveEngine=/path/to/output/engine/weight/file --int8 --fp16 --workspace=4096 --profilingVerbosity=detailed --calib=/path/to/input/calibration/data/cache/file
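Since Polygraphy is failing for me, my fallback (and, if I understand correctly, the usual way to produce the cache in the first place) is to run the calibration build through the TensorRT Python API on the Nano itself. A minimal sketch, assuming the EntropyCalibrator class above and placeholder paths throughout:

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)


def build_int8_engine(onnx_path, calibrator, engine_path):
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            raise RuntimeError("ONNX parse failed")

    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 30  # 1 GiB; keep modest on the Nano
    config.set_flag(trt.BuilderFlag.INT8)
    config.set_flag(trt.BuilderFlag.FP16)  # FP16 fallback where INT8 is unsupported
    # Calibration runs during the build and writes calib.cache via the calibrator.
    config.int8_calibrator = calibrator

    engine = builder.build_engine(network, config)
    if engine is None:
        raise RuntimeError("engine build failed")
    with open(engine_path, "wb") as f:
        f.write(engine.serialize())


if __name__ == "__main__":
    calib = EntropyCalibrator("/path/to/calibration/images", "calib.cache")
    build_int8_engine("/path/to/model.onnx", calib, "model_int8.engine")
```

Once calib.cache exists I expect to feed it to the trtexec command above via --calib. To confirm the cache is actually consumed (query 2), my plan is to rerun the build with --verbose and search the log for calibration-related messages, but I would appreciate confirmation of what a successful cache load looks like in the log.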