TensorRT 8 - nvinfer1::CudaRuntimeError GUI application


I have developed an inference pipeline with TensorRT (using a saved ONNX model) that I have used successfully across many versions of TensorRT (v4 → v7). We use TensorRT for inference inside a docker container with NCR, and often use graphical X11 applications from within the container as well. I have been trying to update to TensorRT 8, and have been experiencing some issues.

After modifying our code to be compatible with TensorRT 8, I have had success running inference in our test suites and other headless environments. However, when I run our inference pipeline from a GUI application, I see the error below after closing the application. TensorRT produces no useful logs, and no errors occur while the application is running, performing inference, and using the GPU.

terminate called after throwing an instance of 'nvinfer1::CudaRuntimeError'
  what():  driver shutting down
Aborted (core dumped)

I’ve tried poking around to find any other useful logging, but have not had any success. The only information I have gleaned is that, according to the GDB backtrace, the error seems to originate in libcudnn. The crash occurs before our code gets a chance to call the following functions:


Let me know if you have any ideas or suggestions for where I can look for clues. Thank you!


TensorRT Version: 8.0.3-1+cuda11.3
GPU Type: RTX 2080 Super
Nvidia Driver Version: 470.63.01 (also occurs with latest 460)
CUDA Version: 11.3.1
CUDNN Version:
Operating System + Version: Ubuntu 20.04 (host OS)
Baremetal or Container (if container which image + tag): container built from nvidia/cudagl:11.3.1-runtime-ubuntu18.04

Steps To Reproduce

  1. Perform inference in C++ with TensorRT from a process that also has an X11 window open (e.g. OpenCV window).
  2. Close the window / terminate the process
  3. Observe the crash

Please share the ONNX model and the script, if you have not already, so that we can assist you better.
In the meantime, you can try a few things:

  1. Validate your model with the snippet below:

import onnx

filename = "yourONNXmodel"  # placeholder: replace with the path to your ONNX model
model = onnx.load(filename)
onnx.checker.check_model(model)  # raises an exception if the model is invalid

  2. Try running your model with the trtexec command.

If you are still facing the issue, please share the trtexec --verbose log for further debugging.

The issue does not seem to be with the ONNX model itself. It produces no errors with onnx.checker.check_model(...) and runs just fine with TRT 8 when there is no GUI application present.

The problem I mentioned only occurs at program termination when I use the model from within a GUI application (e.g. OpenCV window). This makes me think it is a problem with freeing GPU memory, but it does not occur when we use our same TensorRT inference code without a GUI application.
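
To make that hypothesis concrete, here is a minimal C++ sketch of one way to force GPU-owning objects to be released at a known point: before the window closes and before main returns, rather than during static destruction or an exit handler. It is only an illustration under stated assumptions. GpuResource and Window are hypothetical stand-ins (no TensorRT, CUDA, or OpenCV calls are made), and whether late teardown is the actual cause of the "driver shutting down" error is not confirmed anywhere in this thread.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Records teardown events so the order can be inspected afterwards.
using EventLog = std::vector<std::string>;

// Hypothetical stand-in for a GPU-backed object (runtime, engine, context).
// In a real pipeline this would own the corresponding TensorRT object.
class GpuResource {
public:
    GpuResource(std::string name, EventLog& log)
        : name_(std::move(name)), log_(log) {}
    ~GpuResource() { log_.push_back("release " + name_); }

    // Non-copyable, so each resource is released exactly once.
    GpuResource(const GpuResource&) = delete;
    GpuResource& operator=(const GpuResource&) = delete;

private:
    std::string name_;
    EventLog& log_;
};

// Hypothetical stand-in for the GUI window.
struct Window {
    explicit Window(EventLog& log) : log_(log) {}
    ~Window() { log_.push_back("close window"); }
    EventLog& log_;
};

// Mimics the body of main(): all GPU objects live in an inner scope,
// so they are destroyed while the process is still fully alive.
EventLog run_application() {
    EventLog log;
    {
        Window window(log);
        {
            // Locals are destroyed in reverse order of declaration.
            GpuResource runtime("runtime", log);
            GpuResource engine("engine", log);
            GpuResource context("context", log);
            log.push_back("inference");
        }  // context, engine, runtime are released here, in that order
        assert(log.size() == 4);  // inference plus three releases, window still open
    }  // window is closed here
    log.push_back("return from main");
    return log;
}
```

Calling run_application() yields the events "inference", "release context", "release engine", "release runtime", "close window", "return from main", in that order. The point of the nested scopes is that nothing GPU-related is left to global or static destructors. If the GUI toolkit ends the process through exit() from a close handler, stack objects like these are never unwound, so it may be worth checking how the application actually terminates.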

I’m not able to share the ONNX model or the script I use to run it. I understand that will hinder your ability to help, but that is all proprietary. Here is the verbose output from loading the model with TRT 8. There are no errors and everything seems to work just fine.


The logs look normal; the problem may be in your inference script with the GUI. The error you reported looks GPU-driver related. Please make sure you’re allocating and freeing resources correctly.
Hope the following may be helpful to you.

Thank you.