Trouble deserialising a TRT engine file

Description

I’m trying to run a simple inference example using a TRT engine that was converted on the same machine. Since the PyTorch NGC image doesn’t ship with PyCUDA, I installed it with pip install pycuda. I used the example from the quick start guide, with the following code:

import numpy as np
import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit  # creates and activates a CUDA context on import

# Explicitly create a context on device 0 (on top of autoinit's context)
dev = cuda.Device(0)
ctx = dev.make_context()

try:
    TRT_LOGGER = trt.Logger(trt.Logger.INFO)
    with open("ResNet18Dense.trt", 'rb') as f, trt.Runtime(TRT_LOGGER) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())
except Exception:
    print("Not working")

The code fails with the following output:

[TensorRT] INFO: [MemUsageChange] Init CUDA: CPU +151, GPU +0, now: CPU 175, GPU 359 (MiB)
[TensorRT] INFO: Loaded engine size: 135 MB
[TensorRT] INFO: [MemUsageSnapshot] deserializeCudaEngine begin: CPU 311 MiB, GPU 359 MiB
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +243, GPU +100, now: CPU 554, GPU 595 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuDNN: CPU +245, GPU +102, now: CPU 799, GPU 697 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 799, GPU 679 (MiB)
[TensorRT] INFO: [MemUsageSnapshot] deserializeCudaEngine end: CPU 799 MiB, GPU 679 MiB
[TensorRT] INTERNAL ERROR: [defaultAllocator.cpp::free::85] Error Code 1: Cuda Runtime (invalid argument)
[TensorRT] INTERNAL ERROR: [resources.h::operator()::445] Error Code 1: Cuda Driver (invalid device context)
Bus error (core dumped)

Running the code without the dev = cuda.Device(0) and ctx = dev.make_context() lines jumps straight to the error, without even printing the INFO logs.

My guess is that the CUDA runtime/driver is not being initialised correctly, but I have found precious little information on what these errors mean. Any help would be much appreciated.
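
For reference, pycuda.autoinit already creates and activates a context when it is imported, so the make_context() call above stacks a second context on top of it. A variant I would try next (just a sketch, assuming context lifetime is the culprit, which I have not verified) manages a single explicit context and releases the engine before popping it:

import tensorrt as trt
import pycuda.driver as cuda  # no pycuda.autoinit, so only one context exists

cuda.init()               # initialise the CUDA driver API
dev = cuda.Device(0)
ctx = dev.make_context()  # becomes the current context

try:
    TRT_LOGGER = trt.Logger(trt.Logger.INFO)
    with open("ResNet18Dense.trt", 'rb') as f, trt.Runtime(TRT_LOGGER) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())
    # ... inference would go here ...
    del engine            # free TensorRT's device memory while ctx is still current
finally:
    ctx.pop()             # make the context non-current
    ctx.detach()          # drop the reference so the context can be destroyed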

Environment

TensorRT Version : 8.0.1.6
GPU Type : NVIDIA GTX 1080 Ti x 2
Nvidia Driver Version : 470.57.02 (on the host machine)
CUDA Version : 11.4.1
CUDNN Version : 8.2.2.26
Operating System + Version : Ubuntu 20.04
Python Version (if applicable) : 3.8
TensorFlow Version (if applicable) : N/A
PyTorch Version (if applicable) : 1.10.0a0+3fd9dcf
Baremetal or Container (if container which image + tag) : nvcr.io/nvidia/pytorch:21.08-py3

Relevant Files

The trt engine: https://get.station307.com/QgvvHqXEB42/ResNet18Dense.trt
The original ONNX file: https://get.station307.com/LdcEWrk4oe1/model-ResNet18Dense-10-AffSynth12_ONNX.onnx

Steps To Reproduce

  1. Convert the ONNX file to a TRT engine using trtexec --onnx=model-ResNet18Dense-10-AffSynth12_ONNX.onnx --saveEngine=ResNet18Dense.trt --explicitBatch
  2. Run the code described above
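
As an additional check, I believe the engine can also be loaded without PyCUDA at all (assuming trtexec's --loadEngine option exercises the same deserialisation path):

trtexec --loadEngine=ResNet18Dense.trt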

Hi,
Could you share the ONNX model and the script, if not already shared, so that we can assist you better?
In the meantime, you can try a few things:

  1. Validate your model with the snippet below.

check_model.py

import sys
import onnx

filename = sys.argv[1]  # path to your ONNX model
model = onnx.load(filename)
onnx.checker.check_model(model)
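
Run it as, e.g., python check_model.py model-ResNet18Dense-10-AffSynth12_ONNX.onnx; if the checker raises no exception, the ONNX file itself is structurally valid.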
  2. Try running your model with the trtexec command.
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/trtexec
If you are still facing the issue, please share the trtexec "--verbose" log for further debugging.
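
For example, using the model file from this thread, something along these lines should capture the full verbose log to a file that can be attached here:

trtexec --onnx=model-ResNet18Dense-10-AffSynth12_ONNX.onnx --verbose 2>&1 | tee trtexec_verbose.log
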
Thanks!