CUDA error when running multiple pipelines one after another

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU)
• DeepStream Version
• JetPack Version (valid for Jetson only)
• TensorRT Version
• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type( questions, new requirements, bugs)
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

Good morning,

I have been facing an issue for a few months that I have not been able to fix. We have internal software that goes through a list of models and runs a pipeline for each one, one after another. After running 3-7 pipelines in a row, I sometimes get the error below. The config associated with the pipeline is not the problem: when I re-run the pipeline that hit the error, it completes without issue. The error only occurs when I queue multiple pipelines one after another.

My guess is that between pipelines there is a memory allocation that is sometimes not released. Can you advise on how to fix this issue? It occurs either while running tlt-converter or right before a pipeline starts.
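For reference, here is a minimal sketch of what one run/teardown cycle looks like, assuming the GStreamer Python bindings; build_pipeline is a placeholder for our internal pipeline construction, not actual code from it.

import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst

Gst.init(None)

def run_deepstream(vid):
    pipeline = build_pipeline(vid)  # placeholder for internal setup
    pipeline.set_state(Gst.State.PLAYING)

    # Block until the pipeline posts EOS or an error on the bus
    bus = pipeline.get_bus()
    bus.timed_pop_filtered(Gst.CLOCK_TIME_NONE,
                           Gst.MessageType.EOS | Gst.MessageType.ERROR)

    # Setting the state to NULL should release element resources,
    # including nvinfer's GPU allocations, before the next run starts
    pipeline.set_state(Gst.State.NULL)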

We are using DeepStream 5.1 with TLT/TAO models.

ERROR: nvdsinfer_context_impl.cpp:1573 Failed to synchronize on cuda copy-coplete-event, cuda err_no:700, err_str:cudaErrorIllegalAddress
0:00:01.229666287   976      0x24a3ed0 WARN                 nvinfer gstnvinfer.cpp:2021:gst_nvinfer_output_loop:<primary-inference> error: Failed to dequeue output from inferencing. NvDsInferContext error: NVDSINFER_CUDA_ERROR
0:00:01.229755945   976      0x24a3ed0 WARN                 nvinfer gstnvinfer.cpp:616:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::releaseBatchOutput() <nvdsinfer_context_impl.cpp:1599> [UID = 1]: Tried to release an unknown outputBatchID
Error: gst-stream-error-quark: Failed to dequeue output from inferencing. NvDsInferContext error: NVDSINFER_CUDA_ERROR (1): gstnvinfer.cpp(2021): gst_nvinfer_output_loop (): /GstPipeline:pipeline0/GstNvInfer:primary-inference

• Hardware Platform (Jetson / GPU)
T4 & 3090
• DeepStream Version
5.1
• JetPack Version (valid for Jetson only)
• TensorRT Version
7.2
• NVIDIA GPU Driver Version (valid for GPU only)
470
• Issue Type( questions, new requirements, bugs)
Bug
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
We have a script that iterates over 2 videos and runs DeepStream on one video at a time. It starts the next pipeline as soon as the previous one has finished.
for vid in vid_folder:
    run_deepstream(vid)
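For illustration, a minimal, self-contained version of that loop. It assumes run_deepstream shells out to the deepstream-app reference binary; the config file name and video folder here are placeholders, not our actual setup.

import subprocess
from pathlib import Path

def run_deepstream(vid):
    # Launch deepstream-app on one video and block until it exits,
    # so pipelines never overlap. -c sets the config file, -i the input.
    subprocess.run(
        ['deepstream-app', '-c', 'deepstream_app_config.txt', '-i', str(vid)],
        check=True)

vid_folder = sorted(Path('videos').glob('*.mp4'))
for vid in vid_folder:
    run_deepstream(vid)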

• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)
CUDA illegal memory access

Have you tried the same case on the latest DeepStream 6.1 version?

How can we reproduce the issue on our side?

There has been no update from you for a while, so we are assuming this is not an issue anymore.
Hence we are closing this topic. If you need further support, please open a new one.
Thanks

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.