Unable to Perform Inference After Deserializing CUDA Engine


I’m facing an issue where I can’t perform inference after deserializing a CUDA engine using the TensorRT runtime API in Python.


root@bfd941539e31:/workspace# python -c "import tensorrt as trt; print(trt.__version__)"
Tue Sep 26 10:58:16 2023
| NVIDIA-SMI 520.61.05 Driver Version: 520.61.05 CUDA Version: 11.8 |
python --version

I want to use the TensorRT runtime API in Python to reproduce the result of the following command:

detectnet_v2 inference -e /workspace/FaceDetect_infer.txt -m /workspace/model.trt -r /workspace -i /workspace

When I run this command, it produces output like:

face 0.00 0 0.00 201.901 170.121 400.457 518.732 0.00 0.00 0.00 0.00 0.00 0.00 0.00 27.397
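For comparing results later, it may help to parse this line programmatically. The sketch below assumes the output follows the 16-field KITTI label layout (class name, truncation, occlusion, alpha, four bounding-box coordinates, seven 3D fields, then a score) and that the last field is the detection confidence. The `KittiDetection` type and `parse_kitti_line` helper are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Tuple


class KittiDetection(NamedTuple):
    label: str
    bbox: Tuple[float, float, float, float]  # (left, top, right, bottom) in pixels
    score: float


def parse_kitti_line(line: str) -> KittiDetection:
    """Parse one KITTI-style detection line (assumed to have 16 fields)."""
    fields = line.split()
    if len(fields) != 16:
        raise ValueError(f"expected 16 fields, got {len(fields)}")
    # Fields 4..7 hold the 2D box; the final field is assumed to be the score.
    left, top, right, bottom = (float(v) for v in fields[4:8])
    return KittiDetection(fields[0], (left, top, right, bottom), float(fields[15]))


det = parse_kitti_line(
    "face 0.00 0 0.00 201.901 170.121 400.457 518.732 "
    "0.00 0.00 0.00 0.00 0.00 0.00 0.00 27.397"
)
print(det.label, det.bbox, det.score)
```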

I would like to create Python code using the TensorRT runtime API to achieve the same result.

I’ve installed pycuda, and here’s the code I’ve created:


import tensorrt as trt
import cv2
import numpy as np

# Create a Logger object to handle TensorRT's logging.
TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Create a Runtime object. This object is used to deserialize the serialized ICudaEngine.
runtime = trt.Runtime(TRT_LOGGER)

# Specify the path to the serialized CUDA engine file. This file should be pre-generated.
serialized_engine_path = "model.trt"

# Read the serialized CUDA engine from the file.
with open(serialized_engine_path, "rb") as f:
    serialized_engine = f.read()

# Use the Runtime object to deserialize the serialized CUDA engine.
engine = runtime.deserialize_cuda_engine(serialized_engine)

# Check if the deserialized ICudaEngine object is valid.
if engine:
    print("Successfully deserialized CUDA Engine!")
else:
    print("Failed to deserialize CUDA Engine!")

I’ve created this code, but inference doesn’t seem to work. Is there a solution?


We recommend you try the latest TensorRT version 8.6.1.
Also, please refer to the following document and samples and make sure your inference script is correct.
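For orientation, the script above stops after deserialization: it never creates an execution context, allocates device buffers, or runs the engine. A minimal sketch of those missing steps follows. It assumes the TensorRT 8.x binding-index API (deprecated in later releases), an explicit-batch engine with static shapes and a single input, and pycuda for memory management. The `infer` function is a hypothetical helper, not part of TensorRT.

```python
def infer(engine, input_array):
    """Run one synchronous inference; returns {output binding name: numpy array}.

    Assumes a CUDA context already exists (e.g. via `import pycuda.autoinit`).
    """
    # Local imports: the GPU dependencies are only needed when this is called.
    import numpy as np
    import pycuda.driver as cuda
    import tensorrt as trt

    context = engine.create_execution_context()
    bindings = []    # device pointers, in binding order
    keep_alive = []  # hold allocations so pycuda does not free them early
    outputs = {}     # output name -> (host buffer, device buffer)

    for i in range(engine.num_bindings):
        shape = tuple(engine.get_binding_shape(i))
        dtype = trt.nptype(engine.get_binding_dtype(i))
        if engine.binding_is_input(i):
            # Assumption: input_array is already preprocessed for this model.
            host = np.ascontiguousarray(input_array, dtype=dtype).reshape(shape)
            device = cuda.mem_alloc(host.nbytes)
            cuda.memcpy_htod(device, host)
        else:
            host = np.empty(shape, dtype=dtype)
            device = cuda.mem_alloc(host.nbytes)
            outputs[engine.get_binding_name(i)] = (host, device)
        keep_alive.append(device)
        bindings.append(int(device))

    if not context.execute_v2(bindings):
        raise RuntimeError("TensorRT execution failed")

    results = {}
    for name, (host, device) in outputs.items():
        cuda.memcpy_dtoh(host, device)  # copy results back to the host
        results[name] = host
    return results
```

With pycuda, `import pycuda.autoinit` would normally go at the top of the script, before `deserialize_cuda_engine` is called, so that the engine and the buffers share one CUDA context. An implicit-batch engine would instead need `context.execute(batch_size=..., bindings=...)`. The arrays returned here are raw network outputs; DetectNet_v2's own post-processing (grid decoding and clustering) would still be required to obtain boxes like the ones the `detectnet_v2 inference` command prints.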

If you still experience the same error, please share the complete verbose logs, along with a minimal model and script that reproduce the issue, so we can debug it.

Thank you.