Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU) GPU
• DeepStream Version 6.1.1
• JetPack Version (valid for Jetson only)
• TensorRT Version 8.6.1
• NVIDIA GPU Driver Version (valid for GPU only) 535.161.07
• Issue Type( questions, new requirements, bugs) bugs
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)
Hi, I was running my DeepStream application as a Docker container (nvcr.io/nvidia/deepstream:6.1-samples) on a GPU server (Ubuntu 20.04).
I was running 8 applications in each container.
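For context, each container was launched roughly like this. Only the image tag comes from the setup above; the container name, mount path, and application command are placeholders, not my exact invocation:

```shell
# Hypothetical launch command -- container name, volume path, and the
# deepstream-app config are placeholders; only the image tag is real.
docker run -d --gpus all \
  --name ds-container-1 \
  -v /opt/configs:/configs \
  nvcr.io/nvidia/deepstream:6.1-samples \
  deepstream-app -c /configs/app1.txt
```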
It was working fine for a couple of weeks, but suddenly 5 applications stopped with the following error while the rest of them were working fine.
CUDA_ERROR_UNKNOWN
ERROR: [TRT]: 1: Unexpected exception std::exception
ERROR: nvdsinfer_backend.cpp:506 Failed to enqueue trt inference batch
ERROR: nvdsinfer_context_impl.cpp:1650 Infer context enqueue buffer failed, nvinfer error:NVDSINFER_TENSORRT_ERROR
139:44:20.703143289 1 0x321fd20 WARN nvinfer gstnvinfer.cpp:1338:gst_nvinfer_input_queue_loop:<nvinfer_0> error: Failed to queue input batch for inferencing
2024-05-30T02:42:07Z: [CORE] |ERROR |: [3570721400 ] gst-stream-error-quark: Failed to queue input batch for inferencing (1): gstnvinfer.cpp(1338): gst_nvinfer_input_queue_loop (): /GstPipeline:pipeline0/GstNvInfer:nvinfer_0
GPUassert: unknown error src/modules/NvMultiObjectTracker/context.cpp 197
The device-id doesn’t seem to be the problem, since containers with and without the error were running on the same device.
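To help localize the failure, I can collect GPU state when it happens. A minimal sketch of what I run on the host (the function names are mine; it assumes the standard nvidia-smi tool and, for Xid events, readable kernel logs — either may be unavailable, in which case the script just reports that):

```python
#!/usr/bin/env python3
"""Collect basic GPU health info to help localize a CUDA_ERROR_UNKNOWN."""
import shutil
import subprocess


def run(cmd):
    """Run a command and return its stdout, or None if it is unavailable."""
    if shutil.which(cmd[0]) is None:
        return None
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
        return result.stdout
    except (subprocess.SubprocessError, OSError):
        return None


def gpu_diagnostics():
    """Gather driver/GPU state and kernel Xid messages, where accessible."""
    return {
        # Per-GPU memory and ECC state as reported by the driver
        "nvidia_smi": run([
            "nvidia-smi",
            "--query-gpu=index,name,memory.used,memory.total,"
            "ecc.errors.uncorrected.volatile.total",
            "--format=csv",
        ]),
        # Xid events in the kernel log often accompany CUDA_ERROR_UNKNOWN
        "xid_events": run(["sh", "-c", "dmesg | grep -i xid"]),
    }


if __name__ == "__main__":
    for key, value in gpu_diagnostics().items():
        print(f"== {key} ==\n{value or '(not available)'}")
```

So far this has not pointed at anything obvious, but I can attach its output from an affected host if that helps.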