Benchmark: int8 similar to fp32 on yolov8 from ultralytics


I just installed JetPack 5.1.2 on my JNO 8GB. I installed ultralytics and got PyTorch working with CUDA.

I started benchmarking the yolov8 models from the ultralytics package, and I get the same performance for the fp32 and int8 configurations (the fp16 inference time is, as expected, about half of fp32).

Is this a problem with int8 support on the Jetson Orin Nano?

Thanks in advance.

from ultralytics.utils.benchmarks import benchmark
benchmark(model=f'', data='coco8.yaml', imgsz=640, int8=True, device=0)

The half and int8 arguments were changed accordingly for each benchmark:

FP32 Benchmarks complete for on coco8.yaml at imgsz=640 (905.18s)

                   Format Status❔  Size (MB)  metrics/mAP50-95(B)  Inference time (ms/im)
4                TensorRT       ✅       13.6               0.6117                   12.63

FP16 Benchmarks complete for on coco8.yaml at imgsz=640 (919.86s)

                   Format Status❔  Size (MB)  metrics/mAP50-95(B)  Inference time (ms/im)
4                TensorRT       ✅        8.2               0.6092                    7.04

INT8 Benchmarks complete for on coco8.yaml at imgsz=640 (423.97s)

                   Format Status❔  Size (MB)  metrics/mAP50-95(B)  Inference time (ms/im)
4                TensorRT       ✅       13.5               0.6117                   12.61
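To make the comparison explicit, here is a small sketch that computes the speedup of each run relative to FP32, using only the numbers from the three tables above. The `looks_like_fp32` check and its 2% tolerance are my own assumption, not an ultralytics or TensorRT feature; it just flags a run whose engine size, mAP and latency are all nearly identical to the FP32 run, which would suggest the engine was not actually built in the requested precision.

```python
# TensorRT rows copied from the FP32 / FP16 / INT8 benchmark tables above.
results = {
    "FP32": {"size_mb": 13.6, "map50_95": 0.6117, "ms_per_im": 12.63},
    "FP16": {"size_mb": 8.2, "map50_95": 0.6092, "ms_per_im": 7.04},
    "INT8": {"size_mb": 13.5, "map50_95": 0.6117, "ms_per_im": 12.61},
}


def speedup_vs_fp32(results):
    """Return FP32 latency divided by each run's latency (higher is faster)."""
    base = results["FP32"]["ms_per_im"]
    return {name: base / run["ms_per_im"] for name, run in results.items()}


def looks_like_fp32(results, name, rel_tol=0.02):
    """Heuristic (assumption): True if a run is indistinguishable from FP32.

    A run is flagged when its engine size and latency are within rel_tol of
    the FP32 values and its mAP is exactly the same.
    """
    base, run = results["FP32"], results[name]

    def close(key):
        return abs(run[key] - base[key]) / base[key] <= rel_tol

    same_map = run["map50_95"] == base["map50_95"]
    return close("size_mb") and close("ms_per_im") and same_map


if __name__ == "__main__":
    for name, ratio in speedup_vs_fp32(results).items():
        flag = looks_like_fp32(results, name) if name != "FP32" else False
        print(f"{name}: {ratio:.2f}x vs FP32, looks like FP32: {flag}")
```

With these numbers the INT8 run is flagged and the FP16 run is not, so the INT8 engine may simply have been built as FP32.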


Do you have the serialized TensorRT engine?
If yes, please test it with trtexec with maximized performance and update.

$ sudo nvpmodel -m 0
$ sudo jetson_inference
$ /usr/src/tensorrt/bin/trtexec --loadEngine=[file]


@fnando1995 also run sudo jetson_clocks --fan

@AastaLLL This is the first time I have seen “sudo jetson_inference”.
Is it related to dusty’s git package? I’ll try to read up on it.

I think I do not have the serialized TensorRT engine. Where can I check for it, and how do I install or upgrade it?

I do have TensorRT, as it comes with the JetPack installation, but when I try your commands, I get this output:

jn@ubuntu:~/Documents$ /usr/src/tensorrt/bin/trtexec --loadEngine=/home/jn/Documents/yolov8l.engine
&&&& RUNNING TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --loadEngine=/home/jn/Documents/yolov8n.engine
[12/14/2023-13:47:57] [I] === Model Options ===
[12/14/2023-13:47:57] [I] Format: *
[12/14/2023-13:47:57] [I] Model: 
[12/14/2023-13:47:57] [I] Output:

[12/14/2023-13:47:57] [I] Memory Clock Rate: 0.624 GHz
[12/14/2023-13:47:57] [I] 
[12/14/2023-13:47:57] [I] TensorRT version: 8.5.2
[12/14/2023-13:47:57] [I] Engine loaded in 0.211736 sec.
[12/14/2023-13:47:58] [I] [TRT] Loaded engine size: 168 MiB
[12/14/2023-13:47:58] [E] Error[1]: [stdArchiveReader.cpp::StdArchiveReader::32] Error Code 1: Serialization (Serialization assertion magicTagRead == kMAGIC_TAG failed.Magic tag does not match)
[12/14/2023-13:47:58] [E] Error[4]: [runtime.cpp::deserializeCudaEngine::65] Error Code 4: Internal Error (Engine deserialization failed.)
[12/14/2023-13:47:58] [E] Engine deserialization failed
[12/14/2023-13:47:58] [E] Got invalid engine!
[12/14/2023-13:47:58] [E] Inference set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --loadEngine=/home/jn/Documents/yolov8n.engine

Your “yolov8n.engine” is a serialized engine which is deserialized before the start of inference.

May I ask how you created your calibration cache? If fp16 is working as expected, and I assume fp32 is too, your int8 calibration cache may be incorrect.


Serialization assertion magicTagRead == kMAGIC_TAG failed.Magic tag does not match

The error indicates that your engine file was created in a different environment (different software or hardware).

Since TensorRT optimizes with environment info, please recreate the engine file when you change the device or update the software.
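As a rough sketch of recreating the engine on the Jetson itself, the helper below assembles a trtexec build command for a given precision. It assumes you have an ONNX export of the model; the file names `yolov8n.onnx`, `yolov8n_int8.engine` and `calib.cache` are hypothetical placeholders. The flags used (`--onnx`, `--saveEngine`, `--fp16`, `--int8`, `--calib`) are standard trtexec options. The helper only builds and prints the command, it does not run it.

```python
import shlex

# trtexec location used earlier in this thread (JetPack default).
TRTEXEC = "/usr/src/tensorrt/bin/trtexec"

# Extra trtexec flags per precision; FP32 is the default and needs none.
PRECISION_FLAGS = {"fp32": [], "fp16": ["--fp16"], "int8": ["--int8"]}


def build_trtexec_cmd(onnx_path, engine_path, precision="fp32", calib_cache=None):
    """Return the trtexec argument list that builds and saves an engine."""
    if precision not in PRECISION_FLAGS:
        raise ValueError(f"unknown precision: {precision!r}")
    cmd = [TRTEXEC, f"--onnx={onnx_path}", f"--saveEngine={engine_path}"]
    cmd += PRECISION_FLAGS[precision]
    if calib_cache is not None:
        if precision != "int8":
            raise ValueError("a calibration cache only applies to int8")
        cmd.append(f"--calib={calib_cache}")
    return cmd


if __name__ == "__main__":
    # Hypothetical file names; replace them with your own paths.
    cmd = build_trtexec_cmd("yolov8n.onnx", "yolov8n_int8.engine", "int8")
    print(shlex.join(cmd))
```

To actually build, pass the returned list to `subprocess.run(cmd, check=True)` on the target device. Note that without a calibration cache, an `--int8` build from trtexec is only useful for measuring speed, not accuracy; pass `calib_cache="calib.cache"` if you have one.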


