When I try to serialize and quantize a yolov8-pose model into an INT8 engine, the build fails with the following error:
Loading weights: yolov8s-pose.wts
[07/31/2024-03:52:54] [W] [TRT] The implicit batch dimension mode has been deprecated. Please create the network with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag whenever possible.
Your platform support int8: true
Building engine, please wait for a while...
reading calib cache: int8calib.table
[07/31/2024-03:52:59] [E] [TRT] 1: Unexpected exception _Map_base::at
[07/31/2024-03:52:59] [E] [TRT] 2: [builder.cpp::buildSerializedNetwork::751] Error Code 2: Internal Error (Assertion engine != nullptr failed. )
Build engine successfully!
yolov8_pose: /home/djf/tensorrtx/yolov8/yolov8_pose.cpp:31: void serialize_engine(std::string&, std::string&, int&, std::string&, float&, float&, int&): Assertion `serialized_engine' failed.
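As far as I can tell, the "_Map_base::at" message is the what() string of the std::out_of_range that libstdc++'s std::map::at throws on a missing key, so some lookup by name seems to fail while the calibration cache is being read; the final assertion just means buildSerializedNetwork returned nullptr, while "Build engine successfully!" appears to be printed unconditionally. A minimal demo of where that message comes from (nothing tensorrtx-specific, only to show the string):

```cpp
// Minimal demo: libstdc++'s std::map::at throws std::out_of_range with
// exactly this what() string when the key is missing.
// Not tensorrtx code, just an illustration.
#include <iostream>
#include <map>
#include <string>

int main() {
    std::map<std::string, float> table;  // e.g. tensor name -> calibration scale
    try {
        table.at("tensor_not_in_table");  // missing key
    } catch (const std::exception& e) {
        std::cout << e.what() << std::endl;  // prints: _Map_base::at
    }
}
```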
The model is the official yolov8s-pose model. Here are the results of apt show nvidia-cuda and apt show nvidia-tensorrt:
Package: nvidia-cuda
Version: 5.1.2-b104
Priority: standard
Section: metapackages
Maintainer: NVIDIA Corporation
Installed-Size: 229 kB
Depends: cuda-runtime-11-4 (= 11.4.19-1)
Conflicts: cuda-command-line-tools-10-0, cuda-compiler-10-0, cuda-cublas-10-0, cuda-cublas-dev-10-0, cuda-cudart-10-0, cuda-cudart-dev-10-0, cuda-cufft-10-0, cuda-cufft-dev-10-0, cuda-cuobjdump-10-0, cuda-cupti-10-0, cuda-curand-10-0, cuda-curand-dev-10-0, cuda-cusolver-10-0, cuda-cusolver-dev-10-0, cuda-cusparse-10-0, cuda-cusparse-dev-10-0, cuda-documentation-10-0, cuda-driver-dev-10-0, cuda-gdb-10-0, cuda-gpu-library-advisor-10-0, cuda-libraries-10-0, cuda-libraries-dev-10-0, cuda-license-10-0, cuda-memcheck-10-0, cuda-misc-headers-10-0, cuda-npp-10-0, cuda-npp-dev-10-0, cuda-nsight-compute-addon-l4t-10-0, cuda-nvcc-10-0, cuda-nvdisasm-10-0, cuda-nvgraph-10-0, cuda-nvgraph-dev-10-0, cuda-nvml-dev-10-0, cuda-nvprof-10-0, cuda-nvprune-10-0, cuda-nvrtc-10-0, cuda-nvrtc-dev-10-0, cuda-nvtx-10-0, cuda-samples-10-0, cuda-toolkit-10-0, cuda-tools-10-0
Homepage: http://developer.nvidia.com/jetson
Download-Size: 28.1 kB
APT-Manual-Installed: no
APT-Sources: https://repo.download.nvidia.com/jetson/common r35.4/main arm64 Packages
Description: NVIDIA CUDA Meta Package
Package: nvidia-tensorrt
Version: 5.1.2-b104
Priority: standard
Section: metapackages
Maintainer: NVIDIA Corporation
Installed-Size: 205 kB
Depends: tensorrt-libs (= 8.5.2.2-1+cuda11.4)
Conflicts: libnvinfer-plugin6, libnvinfer-plugin7, libnvinfer6, libnvinfer7, libnvonnxparsers6, libnvonnxparsers7, libnvparsers6, libnvparsers7, python-libnvinfer
Homepage: http://developer.nvidia.com/jetson
Download-Size: 27.3 kB
APT-Manual-Installed: no
APT-Sources: https://repo.download.nvidia.com/jetson/common r35.4/main arm64 Packages
Description: NVIDIA TensorRT Meta Package
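For reference, only the INT8 path differs from the FP16 path in the build configuration; it is roughly the sketch below (the macro and class names are my guesses based on the tensorrtx layout, not verified source), and the calibrator is what reads int8calib.table:

```cpp
// Rough sketch of the precision selection in the build step. USE_FP16/USE_INT8,
// Int8EntropyCalibrator2, kInputW/kInputH and kInputTensorName are assumptions
// based on the tensorrtx layout; this is not the verified source.
#if defined(USE_FP16)
    config->setFlag(nvinfer1::BuilderFlag::kFP16);   // this path builds fine
#elif defined(USE_INT8)
    config->setFlag(nvinfer1::BuilderFlag::kINT8);   // this path fails as above
    // The calibrator reads int8calib.table if it exists, otherwise it
    // regenerates the table from the calibration images.
    auto* calibrator = new Int8EntropyCalibrator2(
        1, kInputW, kInputH, "./coco_calib/", "int8calib.table", kInputTensorName);
    config->setInt8Calibrator(calibrator);
#endif
```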
This problem only happens when I quantize the model to INT8; building an FP16 engine works fine. Do you have any idea what might cause this?