TensorRT model load error in DeepStream v6.0.1 with TensorRT 8.5.3.1

Description

The FaceDetectIR ETLT model was converted to TensorRT format. The resulting TensorRT engine gives the error below while loading:

[02/03/2023-11:36:32] [TRT] [E] 1: [stdArchiveReader.cpp::StdArchiveReader::37] Error Code 1: Serialization (Serialization assertion safeVersionRead == safeSerializationVersion failed.Version tag does not match. Note: Current Version: 0, Serialized Engine Version: 43)
[02/03/2023-11:36:32] [TRT] [E] 4: [runtime.cpp::deserializeCudaEngine::66] Error Code 4: Internal Error (Engine deserialization failed.)

Environment

TensorRT Version: 8.5.3.1
GPU Type: Nvidia Tesla V100
Nvidia Driver Version: 440.33.01
CUDA Version: 11.4
CUDNN Version: 8.2.2
Operating System + Version: Ubuntu 20.04.2 LTS
Python Version (if applicable): 3.8.10
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.12.0+cu113
Baremetal or Container (if container which image + tag): Deepstream v6.0.1-triton

Relevant Files

nvidia-smi

	- NVIDIA-SMI 440.33.01
	- Driver Version: 440.33.01
	- CUDA Version: 11.4

nvcc --version

    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2021 NVIDIA Corporation
    Built on Wed_Jul_14_19:41:19_PDT_2021
    Cuda compilation tools, release 11.4, V11.4.100
    Build cuda_11.4.r11.4/compiler.30188945_0

TensorRT version (pip show tensorrt)

	Name: tensorrt
	Version: 8.5.3.1
	Summary: A high performance deep learning inference library
	Home-page: https://developer.nvidia.com/tensorrt
	Author: NVIDIA Corporation
	Author-email: 
	License: Proprietary
	Location: /usr/local/lib/python3.8/dist-packages
	Requires: nvidia-cublas-cu11, nvidia-cuda-runtime-cu11, nvidia-cudnn-cu11
	Required-by: 

Download Tao Converter (Main Link: TAO Converter | NVIDIA NGC)
- wget --content-disposition 'https://api.ngc.nvidia.com/v2/resources/nvidia/tao/tao-converter/versions/v3.22.05_trt8.4_x86/files/tao-converter'
- Set execute permission
○ $ chmod +x tao-converter
- Install the openssl library using the following command.
○ $ sudo apt-get install libssl-dev
- Export the following environment variables (for an x86 platform; see the library check after this list)
○ export TRT_LIB_PATH="/usr/lib/x86_64-linux-gnu"
○ export TRT_INC_PATH="/usr/include/x86_64-linux-gnu"
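
Since tao-converter is expected to pick up the TensorRT libraries under TRT_LIB_PATH, it is worth checking which libnvinfer it actually resolves (a quick sanity check, run from the directory containing the binary):

    $ ldd ./tao-converter | grep -i nvinfer
    # The libnvinfer.so that resolves here is the TensorRT the generated engine
    # will be serialized with; it must match the TensorRT version of whatever
    # later deserializes the engine.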

Download model (Main Link: FaceDetectIR | NVIDIA NGC)
- wget --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/tao/facedetectir/versions/pruned_v1.0.1/zip -O facedetectir_pruned_v1.0.1.zip
- Unzip facedetectir_pruned_v1.0.1.zip
- It contains 3 files (a fetch-and-extract sketch follows this list):
○ facedetectir_int8.txt
○ labels.txt
○ resnet18_facedetectir_pruned.etlt
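
For reference, a minimal extract-and-verify sequence (assuming the zip is in the current working directory):

    $ unzip facedetectir_pruned_v1.0.1.zip -d facedetectir_pruned_v1.0.1
    $ ls facedetectir_pruned_v1.0.1
    facedetectir_int8.txt  labels.txt  resnet18_facedetectir_pruned.etlt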

Convert ETLT to TRT
- Convert resnet18_facedetectir_pruned.etlt to TRT
○ $ ./tao-converter facedetectir_pruned_v1.0.1/resnet18_facedetectir_pruned.etlt -k tlt_encode -o output_cov/Sigmoid,output_bbox/BiasAdd -d 3,416,736 -i nchw -m 64 -t fp16 -e facedetectir_pruned_v1.0.1/resnet18_facedetectir_pruned_8531.trt -b 32

    [INFO] [MemUsageChange] Init CUDA: CPU +251, GPU +0, now: CPU 257, GPU 491 (MiB)
    [INFO] [MemUsageSnapshot] Builder begin: CPU 276 MiB, GPU 491 MiB
    [INFO] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +358, GPU +166, now: CPU 643, GPU 657 (MiB)
    [INFO] [MemUsageChange] Init cuDNN: CPU +277, GPU +164, now: CPU 920, GPU 821 (MiB)
    [WARNING] Detected invalid timing cache, setup a local cache instead
    [INFO] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
    [INFO] Detected 1 inputs and 2 output network tensors.
    [INFO] Total Host Persistent Memory: 71744
    [INFO] Total Device Persistent Memory: 5110784
    [INFO] Total Scratch Memory: 0
    [INFO] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 14 MiB, GPU 512 MiB
    [INFO] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 1252, GPU 963 (MiB)
    [INFO] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 1252, GPU 971 (MiB)
    [INFO] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1252, GPU 955 (MiB)
    [INFO] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1252, GPU 937 (MiB)
    [INFO] [MemUsageSnapshot] Builder end: CPU 1243 MiB, GPU 937 MiB

The TRT engine is generated:
facedetectir_pruned_v1.0.1/resnet18_facedetectir_pruned_8531.trt
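
As a quick sanity check before wiring the engine into DeepStream, it can be deserialized standalone with trtexec (a sketch; the path assumes the tool's usual location inside NGC TensorRT/DeepStream containers):

    $ /usr/src/tensorrt/bin/trtexec --loadEngine=facedetectir_pruned_v1.0.1/resnet18_facedetectir_pruned_8531.trt
    # If this reports the same "Serialization assertion" failure, the engine was
    # serialized by a different TensorRT version than the one deserializing it.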

Steps To Reproduce

Loading the generated engine gives the error below:

[02/03/2023-11:36:32] [TRT] [E] 1: [stdArchiveReader.cpp::StdArchiveReader::37] Error Code 1: Serialization (Serialization assertion safeVersionRead == safeSerializationVersion failed.Version tag does not match. Note: Current Version: 0, Serialized Engine Version: 43)
[02/03/2023-11:36:32] [TRT] [E] 4: [runtime.cpp::deserializeCudaEngine::66] Error Code 4: Internal Error (Engine deserialization failed.)

Hi,

The above error happens due to a TRT version mismatch.
Please ensure that the TensorRT engine is built and executed with the same TensorRT version (and on the same platform).
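
One quick way to compare the versions involved (a sketch; note that the pip-installed tensorrt package and the system libnvinfer that DeepStream and tao-converter link against are separate installations and can diverge):

    $ python3 -c "import tensorrt; print(tensorrt.__version__)"   # pip-installed TensorRT
    $ dpkg -l | grep libnvinfer                                   # system TensorRT used by DeepStream

If the two report different versions, an engine serialized by one cannot be deserialized by the other.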

Thank you.

@spolisetty
Hi,
Thanks for the quick response.
Yes, the ETLT model was converted to TRT format in the same Docker container with TensorRT 8.5.3.1,
and the TRT model is loaded in that same container.

I have upgraded tensorrt with pip install tensorrt.

Thank you.

Hi,

We are moving this post to the TAO Toolkit forum to get better help.

Thank you.

May I know why you upgraded TensorRT?

@Morganh
Currently we are using TensorRT version 8.0.1.6 with CUDA version 11.4.

When we try to convert some of our Torch models to TensorRT format, the conversion fails with errors,
whereas we are able to convert the same Torch models with newer TensorRT versions.

That's one of the primary reasons for upgrading TensorRT.

You are running inference with the NVIDIA-AI-IOT/deepstream_tao_apps samples (GitHub: NVIDIA-AI-IOT/deepstream_tao_apps, sample apps to demonstrate how to deploy models trained with TAO on DeepStream), right? You can comment out the pre-built engine entry in config_infer_primary_facenet.txt so that DeepStream generates the TensorRT engine itself. Then the TRT version is guaranteed to match between building and executing. See the illustrative excerpt below.
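
For illustration, the relevant section of an nvinfer config typically looks like the sketch below (based on the standard DeepStream nvinfer config format; the exact paths and property values in the repo's config_infer_primary_facenet.txt may differ):

    [property]
    tlt-model-key=tlt_encode
    tlt-encoded-model=resnet18_facedetectir_pruned.etlt
    # Comment out model-engine-file so DeepStream rebuilds the engine with
    # its own TensorRT version instead of loading a pre-built, stale one:
    # model-engine-file=resnet18_facedetectir_pruned_8531.trt

With model-engine-file commented out, nvinfer builds the engine from the ETLT model on first run and serializes it with the TensorRT version present in the container.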

Sure. I will check the steps and try it this way.

I have a few other models (Torch, ONNX) that I tried to convert to TRT in the same Docker container, but I get various errors such as CUDA initialization errors, driver version mismatches, and so on.

What's the best way to upgrade the TensorRT version, and to which version?

Current config
TensorRT version: 8.0.1.6
CUDA version: 11.4
CUDA Driver version: 440.33.01

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks

That is not expected. We suggest you update the driver; driver 440.33.01 predates CUDA 11.4 support.

sudo apt purge nvidia-driver-440
sudo apt autoremove
sudo apt autoclean

sudo apt install nvidia-driver-520
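
After installing, a reboot followed by nvidia-smi confirms the new driver is active (a minimal check using standard nvidia-smi query options):

    $ sudo reboot
    # After reboot:
    $ nvidia-smi --query-gpu=driver_version --format=csv,noheader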
