I have developed an inference pipeline with TensorRT (using a saved ONNX model) that I have used successfully across many versions of TensorRT (v4 → v7). We run TensorRT inference inside a Docker container with NCR, and often run graphical X11 applications from within the container as well. I have been trying to update to TensorRT 8 and have been experiencing some issues.
After modifying our code to be compatible with TensorRT 8, I have had success running inference in our test suites and other headless environments. However, when I run our inference pipeline from a GUI application, I see the error below after closing the application. There are no useful logs from TensorRT, nor are there any errors at runtime while the application is actually performing inference and using the GPU.
```
terminate called after throwing an instance of 'nvinfer1::CudaRuntimeError'
  what():  driver shutting down
Aborted (core dumped)
```
I’ve tried poking around to find any other useful logging, but have not had any success. The only information I have gleaned is that the error seems to originate in libcudnn, according to the GDB backtrace. The crash occurs before our code gets a chance to call the following functions:
```cpp
cudaStreamDestroy(...);
cudaFree(...);
context->destroy();
engine->destroy();
```
Let me know if you have any ideas or suggestions for where I can look for clues. Thank you!
TensorRT Version: 8.0.3-1+cuda11.3
GPU Type: RTX 2080 Super
Nvidia Driver Version: 470.63.01 (also occurs with latest 460)
CUDA Version: 11.3.1
CUDNN Version: 184.108.40.206-1+cuda11.3
Operating System + Version: Ubuntu 20.04 (host OS)
Baremetal or Container (if container which image + tag): container built from
Steps to reproduce:
- Perform inference in C++ with TensorRT from a process that also has an X11 window open (e.g. an OpenCV window).
- Close the window / terminate the process
- Observe the crash