• Hardware Platform (GPU) GeForce RTX 3090
• DeepStream Version 6.1.0
• TensorRT Version 8.4.1.5
• NVIDIA GPU Driver Version (valid for GPU only) 510.73.05
• Issue Type( questions, new requirements, bugs) question/bug
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing) Stated below
I’ve run into a weird issue. When I run my pipeline on a single stream, I get the CUDA error shown below partway through the stream (5000+ frames in, always at the same frame). However, when I add two or more file streams, the problem disappears. Have you encountered this before? I’m using a modified version of the people analytics tutorial pipeline; unfortunately, I can’t share the modified version.
Troubleshooting done so far:
- I’ve removed the tiler element just in case (the problem persists either way).
- I’ve removed all device-to-host frame copying (the problem persists either way).
- I’ve tried feeding the same source multiple times (works fine for some reason).
- I’ve looked through the configs; the only batch-size value I saw is 1.
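For anyone who wants to repeat the batch-size check from the last bullet, here is a rough sketch of how the configs could be scanned. It assumes the config files are INI-style `.txt` files in one directory, which is how the sample apps ship them; `find_key` and the directory path are hypothetical names used only for illustration.

```python
import configparser
from pathlib import Path


def find_key(config_dir, key="batch-size"):
    """Return {(file name, section): value} for every occurrence of `key`.

    Assumption: configs are INI-style key files ending in .txt.
    """
    hits = {}
    for path in sorted(Path(config_dir).glob("*.txt")):
        # interpolation=None so '%' characters in values are left alone
        parser = configparser.ConfigParser(strict=False, interpolation=None)
        try:
            parser.read(path)
        except configparser.Error:
            # Not parseable as INI (e.g. a labels file); skip it
            continue
        for section in parser.sections():
            if parser.has_option(section, key):
                hits[(path.name, section)] = parser.get(section, key)
    return hits


if __name__ == "__main__":
    # Hypothetical location; point this at the directory holding your configs
    for (name, section), value in find_key("./configs").items():
        print(f"{name} [{section}] batch-size={value}")
```

This only reports where the key appears; it does not tell you which value the pipeline actually uses if the application overrides it in code.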
I also read the thread “Nvinfer cuda error” (Intelligent Video Analytics / DeepStream SDK, NVIDIA Developer Forums), which didn’t help (I don’t know much about TensorRT).
I also read the thread “Facing cuda memory issue” (CUDA Developer Tools / CUDA-MEMCHECK, NVIDIA Developer Forums), in particular post #3 by mai.algendy, which didn’t help either (no scaling parameter found in the pgie config). Perhaps you know which file contains this info?
Error dump:
ERROR: nvdsinfer_context_impl.cpp:1762 Failed to synchronize on cuda copy-coplete-event, cuda err_no:700, err_str:cudaErrorIllegalAddress
0:01:03.437345106 15184 0x70f9520 WARN nvinfer gstnvinfer.cpp:2337:gst_nvinfer_output_loop:<primary-inference> error: Failed to dequeue output from inferencing. NvDsInferContext error: NVDSINFER_CUDA_ERROR
0:01:03.437395961 15184 0x70f9520 WARN nvinfer gstnvinfer.cpp:643:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::releaseBatchOutput() <nvdsinfer_context_impl.cpp:1796> [UID = 1]: Tried to release an outputBatchID which is already with the context
[AIML1001:15184] *** Process received signal ***
[AIML1001:15184] Signal: Segmentation fault (11)
[AIML1001:15184] Signal code: (128)
[AIML1001:15184] Failing at address: (nil)
Error: gst-stream-error-quark: Failed to dequeue output from inferencing. NvDsInferContext error: NVDSINFER_CUDA_ERROR (1): gstnvinfer.cpp(2337): gst_nvinfer_output_loop (): /GstPipeline:pipeline0/GstNvInfer:primary-inference
Exiting app at 2023-02-01 17:02:25.700028
[AIML1001:15184] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x43090)[0x7fc0fc553090]
[AIML1001:15184] [ 1] /lib/x86_64-linux-gnu/libcuda.so.1(+0x1601b5)[0x7fbf6dc0b1b5]
[AIML1001:15184] [ 2] /lib/x86_64-linux-gnu/libcuda.so.1(+0x23e0a5)[0x7fbf6dce90a5]
[AIML1001:15184] [ 3] /usr/local/cuda-11.6/lib64/libcudart.so.11.0(+0x135d0)[0x7fc0f6e105d0]
[AIML1001:15184] [ 4] /usr/local/cuda-11.6/lib64/libcudart.so.11.0(cudaEventSynchronize+0x190)[0x7fc0f6e48ce0]
[AIML1001:15184] [ 5] ///opt/nvidia/deepstream/deepstream-6.1/lib/libnvbufsurftransform.so(NvBufSurfTransformSyncObjWait+0x22)[0x7fbf6f5d3df2]
[AIML1001:15184] [ 6] /usr/lib/x86_64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_tracker.so(_ZN14ConvBufManager13syncBufferSetEPSt6vectorIP12NvBufSurfaceSaIS2_EEPP25NvBufSurfTransformSyncObj+0x7e)[0x7fbf636cc06e]
[AIML1001:15184] [ 7] /usr/lib/x86_64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_tracker.so(_ZN13NvTrackerProc12processBatchEv+0x64f)[0x7fbf636d192f]
[AIML1001:15184] [ 8] /lib/x86_64-linux-gnu/libstdc++.so.6(+0xd6de4)[0x7fc0f7177de4]
[AIML1001:15184] [ 9] /lib/x86_64-linux-gnu/libpthread.so.0(+0x8609)[0x7fc0fc4f5609]
[AIML1001:15184] [10] /lib/x86_64-linux-gnu/libc.so.6(clone+0x43)[0x7fc0fc62f133]
[AIML1001:15184] *** End of error message ***
[ERROR] 2023-02-01 17:02:25 Error destroying cuda device: `Sо
[WARN ] 2023-02-01 17:02:25 (cudaErrorIllegalAddress)
(the [WARN ] line above was printed 41 times in total, interleaved with the stack trace)
^C^C^[Segmentation fault (core dumped)
Edit:
I tried 2 streams with a tiler height of 1080 and the error came back; with 2 streams and a tiler height of 2 * 1080 it worked. Probably the bbox of some entity is outside the valid range; I will investigate further.
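To test the out-of-range bbox theory, something like the following could be called on each object's box before it reaches the tracker. This is only a sketch: it assumes "the range" means the frame bounds (0..width, 0..height), and `bbox_out_of_bounds` and `clamp_bbox` are hypothetical helpers, not DeepStream APIs. In a Python probe the inputs would presumably come from each object's `rect_params` (left, top, width, height).

```python
def bbox_out_of_bounds(left, top, width, height, frame_w, frame_h):
    """True if any part of the box lies outside the frame (assumed valid range)."""
    return (
        left < 0
        or top < 0
        or left + width > frame_w
        or top + height > frame_h
    )


def clamp_bbox(left, top, width, height, frame_w, frame_h):
    """Clip the box to the frame and return (left, top, width, height)."""
    x1 = min(max(left, 0), frame_w)
    y1 = min(max(top, 0), frame_h)
    x2 = min(max(left + width, 0), frame_w)
    y2 = min(max(top + height, 0), frame_h)
    return x1, y1, x2 - x1, y2 - y1


if __name__ == "__main__":
    # A box that spills past the right and bottom edges of a 1920x1080 frame
    box = (1900, 1000, 100, 100)
    if bbox_out_of_bounds(*box, 1920, 1080):
        print("out of bounds, clamped to", clamp_bbox(*box, 1920, 1080))
```

Logging the frame number whenever `bbox_out_of_bounds` returns true should show whether an offending box appears right before the frame where the crash always happens.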