Description
I’m facing an issue where I can’t perform inference after deserializing a CUDA engine using the TensorRT runtime API in Python.
Environment
root@bfd941539e31:/workspace# python -c "import tensorrt as trt; print(trt.__version__)"
8.5.3.1
Tue Sep 26 10:58:16 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.61.05    Driver Version: 520.61.05    CUDA Version: 11.8     |
python --version
I want to use the TensorRT runtime API in Python to reproduce the result of running the following command:
detectnet_v2 inference -e /workspace/FaceDetect_infer.txt -m /workspace/model.trt -r /workspace -i /workspace
When I run this command, it produces output like:
face 0.00 0 0.00 201.901 170.121 400.457 518.732 0.00 0.00 0.00 0.00 0.00 0.00 0.00 27.397
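As far as I can tell, this is a KITTI-format label line: class, truncation, occlusion, alpha, the pixel bounding box (left, top, right, bottom), unused 3D fields, and what I take to be a confidence score at the end. A minimal sketch of how I read such a line back in Python (the field layout is my assumption based on the KITTI label spec):

line = "face 0.00 0 0.00 201.901 170.121 400.457 518.732 0.00 0.00 0.00 0.00 0.00 0.00 0.00 27.397"
fields = line.split()
cls = fields[0]                                      # object class
left, top, right, bottom = map(float, fields[4:8])   # pixel bounding box
score = float(fields[15])                            # confidence (last field)
print(cls, (left, top, right, bottom), score)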
I would like to create Python code using the TensorRT runtime API to achieve the same result.
I’ve installed pycuda, and here’s the code I’ve created:
import tensorrt as trt
import cv2
import numpy as np
# Create a Logger object to handle TensorRT's logging.
TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
# Create a Runtime object. This object is used to deserialize the serialized ICudaEngine.
runtime = trt.Runtime(TRT_LOGGER)
# Specify the path to the serialized CUDA engine file. This file should be pre-generated.
serialized_engine_path = "model.trt"
# Read the serialized CUDA engine from the file.
with open(serialized_engine_path, "rb") as f:
serialized_engine = f.read()
# Use the Runtime object to deserialize the serialized CUDA engine.
engine = runtime.deserialize_cuda_engine(serialized_engine)
# Check if the deserialized ICudaEngine object is valid.
if engine:
    print("Successfully deserialized CUDA Engine!")
else:
    print("Failed to deserialize CUDA Engine!")
I've put this code together, but I still can't get inference to work end to end. Is there a solution?