Description
I’m facing an issue where I can’t perform inference after deserializing a CUDA engine using the TensorRT runtime API in Python.
Environment
root@bfd941539e31:/workspace# python -c "import tensorrt as trt; print(trt.__version__)"
8.5.3.1
Tue Sep 26 10:58:16 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.61.05    Driver Version: 520.61.05    CUDA Version: 11.8     |
python --version
I want to use the TensorRT runtime API in Python to reproduce the result of running the following command:
detectnet_v2 inference -e /workspace/FaceDetect_infer.txt -m /workspace/model.trt -r /workspace -i /workspace
When I run this command, it produces output like:
face 0.00 0 0.00 201.901 170.121 400.457 518.732 0.00 0.00 0.00 0.00 0.00 0.00 0.00 27.397
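As far as I can tell, this is a KITTI-format label line: class, truncation, occlusion, alpha, the pixel bounding box (left, top, right, bottom), unused 3D fields, and what I take to be a confidence score at the end. A minimal sketch of how I read such a line back in Python (the field layout is my assumption based on the KITTI label spec):

line = "face 0.00 0 0.00 201.901 170.121 400.457 518.732 0.00 0.00 0.00 0.00 0.00 0.00 0.00 27.397"
fields = line.split()
cls = fields[0]                                      # object class
left, top, right, bottom = map(float, fields[4:8])   # pixel bounding box
score = float(fields[15])                            # confidence (last field)
print(cls, (left, top, right, bottom), score)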
I would like to create Python code using the TensorRT runtime API to achieve the same result.
I’ve installed pycuda, and here’s the code I’ve created:
import tensorrt as trt
import cv2
import numpy as np
# Create a Logger object to handle TensorRT's logging.
TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
# Create a Runtime object. This object is used to deserialize the serialized ICudaEngine.
runtime = trt.Runtime(TRT_LOGGER)
# Specify the path to the serialized CUDA engine file. This file should be pre-generated.
serialized_engine_path = "model.trt"
# Read the serialized CUDA engine from the file.
with open(serialized_engine_path, "rb") as f:
serialized_engine = f.read()
# Use the Runtime object to deserialize the serialized CUDA engine.
engine = runtime.deserialize_cuda_engine(serialized_engine)
# Check if the deserialized ICudaEngine object is valid.
if engine:
    print("Successfully deserialized CUDA Engine!")
else:
    print("Failed to deserialize CUDA Engine!")
I've put this code together, but I still can't get inference to work end to end. Is there a solution?