I just installed JetPack 5.1.2 on my Jetson Orin Nano 8GB. I installed Ultralytics and got the PyTorch + CUDA setup sorted out.
I started benchmarking YOLOv8 models from the Ultralytics package, and I get the same performance for the FP32 and INT8 configurations (FP16 is, as expected, about half the inference time of FP32).
Is this a problem with INT8 support on the Jetson Orin Nano?
Thanks in advance.
test.py
from ultralytics.utils.benchmarks import benchmark
benchmark(model='yolov8n.pt', data='coco8.yaml', imgsz=640, int8=True, device=0)
The half and int8 arguments were changed accordingly for each of the benchmarks below.
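In other words, the three runs were along these lines (a reconstruction; the exact flag combinations are assumed from the description above):

from ultralytics.utils.benchmarks import benchmark

# Exact flag combinations assumed: only half / int8 are toggled between runs.
benchmark(model='yolov8n.pt', data='coco8.yaml', imgsz=640, device=0)             # FP32 baseline
benchmark(model='yolov8n.pt', data='coco8.yaml', imgsz=640, half=True, device=0)  # FP16
benchmark(model='yolov8n.pt', data='coco8.yaml', imgsz=640, int8=True, device=0)  # INT8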
FP32 Benchmarks complete for yolov8n.pt on coco8.yaml at imgsz=640 (905.18s)
Format Status❔ Size (MB) metrics/mAP50-95(B) Inference time (ms/im)
4 TensorRT ✅ 13.6 0.6117 12.63
FP16 Benchmarks complete for yolov8n.pt on coco8.yaml at imgsz=640 (919.86s)
Format Status❔ Size (MB) metrics/mAP50-95(B) Inference time (ms/im)
4 TensorRT ✅ 8.2 0.6092 7.04
INT8 Benchmarks complete for yolov8n.pt on coco8.yaml at imgsz=640 (423.97s)
Format Status❔ Size (MB) metrics/mAP50-95(B) Inference time (ms/im)
4 TensorRT ✅ 13.5 0.6117 12.61
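Separately from the benchmark output, one way to double-check whether TensorRT even reports a fast INT8 path on this board would be a query like the sketch below (standard TensorRT Python API; I have not included its output here):

import tensorrt as trt

# Ask TensorRT whether the GPU advertises fast FP16 / INT8 kernels.
logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
print('platform_has_fast_fp16:', builder.platform_has_fast_fp16)
print('platform_has_fast_int8:', builder.platform_has_fast_int8)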
Your “yolov8n.engine” is a serialized engine that is deserialized before inference starts.
May I ask how you created your calibration cache? If your FP16 is working as expected, and I assume FP32 is as well, then maybe your INT8 calibration cache is incorrect.
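For reference, with a sufficiently recent Ultralytics release the INT8 export can build the calibration cache from a dataset you pass in via data; something like the sketch below (the data-based calibration behaviour is an assumption about the Ultralytics version you have installed):

from ultralytics import YOLO

# Export an INT8 TensorRT engine; Ultralytics builds the calibration cache
# from the dataset passed via `data` (assumes a recent ultralytics version).
model = YOLO('yolov8n.pt')
model.export(format='engine', int8=True, data='coco8.yaml', imgsz=640, device=0)

Note that coco8 only contains a handful of images, so for meaningful INT8 calibration you would want a larger, representative calibration set.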