Description
A FaceDetectIR ETLT model was converted to a TensorRT engine with tao-converter. Loading the resulting engine fails with the following error:
[02/03/2023-11:36:32] [TRT] [E] 1: [stdArchiveReader.cpp::StdArchiveReader::37] Error Code 1: Serialization (Serialization assertion safeVersionRead == safeSerializationVersion failed.Version tag does not match. Note: Current Version: 0, Serialized Engine Version: 43)
[02/03/2023-11:36:32] [TRT] [E] 4: [runtime.cpp::deserializeCudaEngine::66] Error Code 4: Internal Error (Engine deserialization failed.)
Environment
TensorRT Version: 8.5.3.1
GPU Type: Nvidia Tesla V100
Nvidia Driver Version: 440.33.01
CUDA Version: 11.4
CUDNN Version: 8.2.2
Operating System + Version: Ubuntu 20.04.2 LTS
Python Version (if applicable): 3.8.10
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.12.0+cu113
Baremetal or Container (if container which image + tag): Deepstream v6.0.1-triton
Relevant Files
nvidia-smi
- NVIDIA-SMI 440.33.01
- Driver Version: 440.33.01
- CUDA Version: 11.4
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Wed_Jul_14_19:41:19_PDT_2021
Cuda compilation tools, release 11.4, V11.4.100
Build cuda_11.4.r11.4/compiler.30188945_0
TensorRT version (pip show tensorrt)
Name: tensorrt
Version: 8.5.3.1
Summary: A high performance deep learning inference library
Home-page: https://developer.nvidia.com/tensorrt
Author: NVIDIA Corporation
Author-email:
License: Proprietary
Location: /usr/local/lib/python3.8/dist-packages
Requires: nvidia-cublas-cu11, nvidia-cuda-runtime-cu11, nvidia-cudnn-cu11
Required-by:
Download Tao Converter (Main Link: TAO Converter | NVIDIA NGC)
- wget --content-disposition 'https://api.ngc.nvidia.com/v2/resources/nvidia/tao/tao-converter/versions/v3.22.05_trt8.4_x86/files/tao-converter'
- Set execute permission
○ $ chmod +x tao-converter
- Install openssl library using the following command.
○ $ sudo apt-get install libssl-dev
- Export the following environment variables (For an x86 platform)
○ export TRT_LIB_PATH="/usr/lib/x86_64-linux-gnu"
○ export TRT_INC_PATH="/usr/include/x86_64-linux-gnu"
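The converter build downloaded above carries a `trt8.4` tag, while `pip show tensorrt` reports 8.5.3.1. The following is a minimal sketch that compares the two, assuming the `trtX.Y` fragment in the download URL names the TensorRT release the converter build targets. The helper names `converter_trt_version` and `major_minor` are hypothetical and exist only for this illustration.

```python
import re


def converter_trt_version(url):
    """Extract (major, minor) from a 'trtX.Y' fragment in the converter URL."""
    match = re.search(r"trt(\d+)\.(\d+)", url)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def major_minor(version):
    """Reduce a dotted version string such as '8.5.3.1' to (major, minor)."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


# URL and version string are copied from this post.
CONVERTER_URL = (
    "https://api.ngc.nvidia.com/v2/resources/nvidia/tao/tao-converter/"
    "versions/v3.22.05_trt8.4_x86/files/tao-converter"
)
PIP_TENSORRT_VERSION = "8.5.3.1"  # from `pip show tensorrt`

converter = converter_trt_version(CONVERTER_URL)
runtime = major_minor(PIP_TENSORRT_VERSION)
if converter != runtime:
    print(f"Converter tag says TensorRT {converter}, Python runtime is {runtime}")
```

A mismatch here does not by itself establish the cause of the error, since the converter uses whichever TensorRT libraries it finds on the system at run time. It is still worth checking, because a serialized engine generally has to be loaded by the same TensorRT version that built it.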
Download model (Main Link: FaceDetectIR | NVIDIA NGC)
- wget --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/tao/facedetectir/versions/pruned_v1.0.1/zip -O facedetectir_pruned_v1.0.1.zip
- Unzip facedetectir_pruned_v1.0.1.zip
- The archive contains three files:
○ facedetectir_int8.txt
○ labels.txt
○ resnet18_facedetectir_pruned.etlt
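Before running the conversion, it can help to confirm that the unzip step produced the three files listed above. A small sketch, assuming the archive was extracted into a `facedetectir_pruned_v1.0.1` directory (the path used in the conversion command); `missing_model_files` is a hypothetical helper name.

```python
from pathlib import Path

# File names as listed in this post.
EXPECTED_FILES = (
    "facedetectir_int8.txt",
    "labels.txt",
    "resnet18_facedetectir_pruned.etlt",
)


def missing_model_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Assumed extraction directory; adjust if the archive was unpacked elsewhere.
    missing = missing_model_files("facedetectir_pruned_v1.0.1")
    print("missing:", missing or "none")
```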
Convert ETLT to TRT
- Convert resnet18_facedetectir_pruned.etlt to TRT
○ $ ./tao-converter facedetectir_pruned_v1.0.1/resnet18_facedetectir_pruned.etlt -k tlt_encode -o output_cov/Sigmoid,output_bbox/BiasAdd -d 3,416,736 -i nchw -m 64 -t fp16 -e facedetectir_pruned_v1.0.1/resnet18_facedetectir_pruned_8531.trt -b 32
[INFO] [MemUsageChange] Init CUDA: CPU +251, GPU +0, now: CPU 257, GPU 491 (MiB)
[INFO] [MemUsageSnapshot] Builder begin: CPU 276 MiB, GPU 491 MiB
[INFO] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +358, GPU +166, now: CPU 643, GPU 657 (MiB)
[INFO] [MemUsageChange] Init cuDNN: CPU +277, GPU +164, now: CPU 920, GPU 821 (MiB)
[WARNING] Detected invalid timing cache, setup a local cache instead
[INFO] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[INFO] Detected 1 inputs and 2 output network tensors.
[INFO] Total Host Persistent Memory: 71744
[INFO] Total Device Persistent Memory: 5110784
[INFO] Total Scratch Memory: 0
[INFO] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 14 MiB, GPU 512 MiB
[INFO] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 1252, GPU 963 (MiB)
[INFO] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 1252, GPU 971 (MiB)
[INFO] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1252, GPU 955 (MiB)
[INFO] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1252, GPU 937 (MiB)
[INFO] [MemUsageSnapshot] Builder end: CPU 1243 MiB, GPU 937 MiB
The TRT engine is generated:
○ facedetectir_pruned_v1.0.1/resnet18_facedetectir_pruned_8531.trt
Steps To Reproduce
Loading the generated engine produces the following error:
[02/03/2023-11:36:32] [TRT] [E] 1: [stdArchiveReader.cpp::StdArchiveReader::37] Error Code 1: Serialization (Serialization assertion safeVersionRead == safeSerializationVersion failed.Version tag does not match. Note: Current Version: 0, Serialized Engine Version: 43)
[02/03/2023-11:36:32] [TRT] [E] 4: [runtime.cpp::deserializeCudaEngine::66] Error Code 4: Internal Error (Engine deserialization failed.)
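The post does not include the loading code. The following is a minimal sketch of a reproduction, assuming the engine is loaded through the TensorRT Python API (`trt.Runtime.deserialize_cuda_engine`). It also prints the runtime version so it can be compared with the TensorRT version that built the engine. `load_engine` is a hypothetical helper name, and the engine path is the one from the conversion step.

```python
ENGINE_PATH = "facedetectir_pruned_v1.0.1/resnet18_facedetectir_pruned_8531.trt"


def load_engine(engine_path):
    """Deserialize a TensorRT engine file; the result is None if deserialization fails."""
    # Imported lazily so this file can be imported on machines without TensorRT.
    import tensorrt as trt

    print("TensorRT Python runtime version:", trt.__version__)
    logger = trt.Logger(trt.Logger.WARNING)
    with open(engine_path, "rb") as f:
        serialized = f.read()
    runtime = trt.Runtime(logger)
    return runtime.deserialize_cuda_engine(serialized)


if __name__ == "__main__":
    try:
        engine = load_engine(ENGINE_PATH)
    except (ImportError, OSError) as exc:
        # TensorRT is not installed, or the engine file could not be read.
        print("could not attempt load:", exc)
    else:
        print("engine loaded" if engine is not None else "deserialization failed")
```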