Deepstream-app crash with nvbufsurface: NvBufSurfaceSysToHWCopy error

Please provide complete information as applicable to your setup.

• Hardware Platform (RTX 2080 Ti)
• DeepStream Version 5.0
• TensorRT Version 7
• Driver Version: 450.36.06, CUDA Version: 11.0

While running deepstream-app with the following setup (a minimal config sketch follows the list):

  • 4 RTSP stream sources
  • 4 RTSP stream sinks (3 hardware encoded + 1 software encoded, with sink3 sync=1)
  • YOLOv3 in FP16 mode (from the objectDetector_Yolo sample)
  • Batch size of 4 for the PGIE, batch size of 16 for the secondary GIEs
  • Secondary inference with the Car Make, Car Type, Car Color and Face Detection models provided in the sample

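For context, a minimal deepstream-app configuration sketch matching this setup could look as follows; the group names and keys follow the standard deepstream-app config format, while the file names, URI, ports and exact values are illustrative assumptions rather than the actual config used:

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5

# [source0] is repeated as [source1]..[source3] for the 4 RTSP inputs
[source0]
enable=1
# type=4 selects an RTSP source
type=4
uri=rtsp://<camera-or-simulated-stream>
num-sources=1
gpu-id=0

[streammux]
gpu-id=0
# batch size equal to the number of sources
batch-size=4
width=1280
height=720
live-source=1

[primary-gie]
enable=1
gpu-id=0
batch-size=4
gie-unique-id=1
config-file=config_infer_primary_yoloV3.txt

# repeated for the Car Make / Car Type / Car Color / Face secondary models
[secondary-gie0]
enable=1
gpu-id=0
batch-size=16
gie-unique-id=2
operate-on-gie-id=1
config-file=config_infer_secondary_carcolor.txt

# [sink0] is repeated per output stream; the HW/SW encoder split is discussed further below
[sink0]
enable=1
# type=4 publishes an RTSP output stream
type=4
codec=1
sync=0
rtsp-port=8554
udp-port=5400
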
The application crashes with different sets of errors:

**PERF: 30.04 (30.02) 30.04 (30.02) 30.04 (30.02) 30.04 (30.02)
**PERF: 29.94 (30.02) 29.94 (30.02) 29.94 (30.02) 29.94 (30.02)
nvbufsurface: NvBufSurfaceSysToHWCopy: failed in mem copy
nvbufsurface: NvBufSurfaceCopy: failed to copy
ERROR: …/nvdsinfer/nvdsinfer_func_utils.cpp:31 [TRT]: …/rtSafe/cuda/cudaSoftMaxRunner.cpp (111) - Cudnn Error in execute: 8 (CUDNN_STATUS_EXECUTION_FAILED)
nvbufsurface: NvBufSurfaceSysToHWCopy: failed in mem copy
ERROR: nvdsinfer_context_impl.cpp:1420 postprocessing cuda waiting event failed , cuda err_no:719, err_str:cudaErrorLaunchFailure
nvbufsurface: NvBufSurfaceCopy: failed to copy
ERROR in BufSurfacecopy
Cuda failure: status=719 in CreateTextureObj at line 2513
nvbufsurftransform.cpp(2369) : getLastCudaError() CUDA error : Recevied NvBufSurfTransformError_Execution_Error : (719) unspecified launch failure.
1:16:21.839067449 17459 0x5651706964f0 WARN nvinfer gstnvinfer.cpp:1188:gst_nvinfer_input_queue_loop:<secondary_gie_2> error: Failed to queue input batch for inferencing
ERROR: …/nvdsinfer/nvdsinfer_func_utils.cpp:31 [TRT]: FAILED_EXECUTION: std::exception
ERROR: nvdsinfer_backend.cpp:290 Failed to enqueue inference batch
ERROR: nvdsinfer_context_impl.cpp:1408 Infer context enqueue buffer failed, nvinfer error:NVDSINFER_TENSORRT_ERROR
ERROR from sink_sub_bin_encoder3: Failed to process frame.
Debug info: gstv4l2videoenc.c(1220): gst_v4l2_video_enc_handle_frame (): /GstPipeline:pipeline/GstBin:processing_bin_2/GstBin:sink_bin/GstBin:sink_sub_bin3/nvv4l2h264enc:sink_sub_bin_encoder3:
Maybe be due to not enough memory or failing driver
1:16:21.839169040 17459 0x5651706965e0 WARN nvinfer gstnvinfer.cpp:1188:gst_nvinfer_input_queue_loop:<secondary_gie_1> error: Failed to queue input batch for inferencing
ERROR from secondary_gie_2: Failed to queue input batch for inferencing
Debug info: gstnvinfer.cpp(1188): gst_nvinfer_input_queue_loop (): /GstPipeline:pipeline/GstBin:secondary_gie_bin/GstNvInfer:secondary_gie_2
ERROR from secondary_gie_1: Failed to queue input batch for inferencing
Debug info: gstnvinfer.cpp(1188): gst_nvinfer_input_queue_loop (): /GstPipeline:pipeline/GstBin:secondary_gie_bin/GstNvInfer:secondary_gie_1
Quitting
Destroying pipeline
ERROR from sink_sub_bin_queue3: Internal data stream error.
Debug info: gstqueue.c(988): gst_queue_handle_sink_event (): /GstPipeline:pipeline/GstBin:processing_bin_2/GstBin:sink_bin/GstBin:sink_sub_bin3/GstQueue:sink_sub_bin_queue3:
streaming stopped, reason error (-5)
Cuda failure: status=46 in CreateTextureObj at line 2513
nvbufsurftransform.cpp(2369) : getLastCudaError() CUDA error : Recevied NvBufSurfTransformError_Execution_Error : (46) all CUDA-capable devices are busy or unavailable.
Cuda failure: status=46 in CreateTextureObj at line 2496
nvbufsurftransform.cpp(2369) : getLastCudaError() CUDA error : Recevied NvBufSurfTransformError_Execution_Error : (46) all CUDA-capable devices are busy or unavailable.
Segmentation fault (core dumped)

Another instance:

**PERF: FPS 0 (Avg) FPS 1 (Avg) FPS 2 (Avg) FPS 3 (Avg)
**PERF: 30.04 (30.02) 30.04 (30.02) 30.04 (30.02) 30.04 (30.02)
**PERF: 29.72 (30.02) 29.72 (30.02) 29.72 (30.02) 29.72 (30.02)
**PERF: 30.09 (30.02) 30.09 (30.02) 30.09 (30.02) 30.09 (30.02)
nvbufsurface: NvBufSurfaceSysToHWCopy: failed in mem copy
nvbufsurface: NvBufSurfaceCopy: failed to copy
nvbufsurface: NvBufSurfaceSysToHWCopy: failed in mem copy
nvbufsurface: NvBufSurfaceCopy: failed to copy
ERROR in BufSurfacecopy
nvbufsurface: NvBufSurfaceSysToHWCopy: failed in mem copy
nvbufsurface: NvBufSurfaceCopy: failed to copy
ERROR in BufSurfacecopy
ERROR from sink_sub_bin_encoder2: Failed to process frame.
Debug info: gstv4l2videoenc.c(1220): gst_v4l2_video_enc_handle_frame (): /GstPipeline:pipeline/GstBin:processing_bin_1/GstBin:sink_bin/GstBin:sink_sub_bin2/nvv4l2h264enc:sink_sub_bin_encoder2:
Maybe be due to not enough memory or failing driver
ERROR from sink_sub_bin_encoder1: Failed to process frame.
Debug info: gstv4l2videoenc.c(1220): gst_v4l2_video_enc_handle_frame (): /GstPipeline:pipeline/GstBin:processing_bin_0/GstBin:sink_bin/GstBin:sink_sub_bin1/nvv4l2h264enc:sink_sub_bin_encoder1:
Maybe be due to not enough memory or failing driver
Could not allocate cuda host buffer
Could not allocate cuda host buffer
Segmentation fault (core dumped)

Any idea what would be causing this?

Why use 1 SW-encoded sink? If you run without this SW encoding, is the issue still reproduced?

Hi

  1. The RTX 2080 Ti supports only 3 concurrent encoding sessions. If we enable the 4th sink to use the HW encoder, the application throws an error to that effect. So the other sinks (sink3-sink7) use the SW encoder type (see the sink config sketch after this list)
  2. The application doesn’t crash if only HW encoding is used (for sink0-sink2). However, all other sinks are disabled in this case (due to the above reason)
  3. We are using multiple parallel RTSP output streams (instead of a single tiled output)
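
For reference, the HW/SW encoder choice is made per sink group in the deepstream-app config. A hedged sketch of the split described above (assuming the enc-type key of the deepstream-app sink group; ports and bitrates are illustrative):

# sinks 0-2: HW encoder (nvv4l2h264enc), limited to 3 sessions on the RTX 2080 Ti
[sink0]
enable=1
type=4
codec=1
enc-type=0
bitrate=4000000
rtsp-port=8554
udp-port=5400
sync=0

# sink3 onwards: SW encoder (x264enc)
[sink3]
enable=1
type=4
codec=1
enc-type=1
bitrate=4000000
rtsp-port=8557
udp-port=5403
sync=1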

The detailed configuration and a related issue concerning RTSP output stream performance are being discussed here:
https://forums.developer.nvidia.com/t/deepstream-app-buffer-caching-observed-when-using-yolov3-with-multiple-rtsp-output-streams/140298/9

However, these two are likely to be independent and hence this issue is being explored separately.

Looking forward to inputs on this.
Thanks.

Is this issue reproduced with only the SW encoder?
If yes, can you share the pipeline?
I think this may be related to how your SW encoder accesses the raw buffer for encoding.

This is reproduced when the HW and SW encoders are used together.
When we run with only SW encoding, the application doesn’t crash (in about 2 hours of running, say).

We are using a slightly modified deepstream-app (with the latest patch from NVIDIA to fix a crash related to latency):
‘N’ RTSP Streams->Mux->nvinfer(YoloV3)->tracker->osd->demux-> ‘N’ RTSP streams.

YoloV3 is from the objectDetector_Yolo sample.
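
For reference, the FP16 setting lives in the nvinfer config from that sample; a sketch of the relevant keys (values taken from or assumed after config_infer_primary_yoloV3.txt, with the batch size matching the number of input streams, 8 here):

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
custom-network-config=yolov3.cfg
model-file=yolov3.weights
model-engine-file=model_b8_gpu0_fp16.engine
labelfile-path=labels.txt
# network-mode=2 selects FP16
network-mode=2
batch-size=8
num-detected-classes=80
parse-bbox-func-name=NvDsInferParseCustomYoloV3
engine-create-func-name=NvDsInferYoloCudaEngineGet
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so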

The config file is available in the previous message that I posted.

Machine configuration is:
CPU: Ryzen 9 3900X, (12 Cores, 24 Threads)
Memory: 64GB
GPU: ZOTAC GAMING GeForce RTX 2080 Ti Twin Fan 11GB GDDR6

From the log, it seems the pipeline had been running for a while.
And, from the above log, it may have failed due to running out of memory; did you monitor the GPU memory usage while it was running?
How many FPS can your SW encoder handle? And what is your target encoding FPS in your case?

Hi,

The throughput when no inference is enabled is 30 FPS for 8 x 720p RTSP streams. However, when inference is enabled, the throughput drops to 13 FPS with the same 8 x 720p @ 30 FPS RTSP input.

When the application starts:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.36.06    Driver Version: 450.36.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 208…    On   | 00000000:09:00.0 Off |                  N/A |
| 69%   79C    P2   178W / 250W |   1925MiB / 11018MiB |     35%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

The above remains constant until the crash occurs.

There are a couple of error messages seen during the run:
(deepstream-app:12589): GStreamer-CRITICAL **: 20:15:37.081: gst_buffer_get_sizes_range: assertion ‘GST_IS_BUFFER (buffer)’ failed
**PERF: 12.98 (13.08) 12.98 (13.08) 12.98 (13.08) 12.98 (13.08) 12.98 (13.08) 12.98 (13.08) 12.98 (13.08) 12.98 (13.08)

Finally, when the crash occurs, the following is the memory status:
+-----------------------------------------------------------------------------+
| Processes:                                                                   |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     12589      C   …stream-app/deepstream-app       1793MiB |
+-----------------------------------------------------------------------------+
GPU 00000000:09:00.0: Detected Critical Xid Error
GPU 00000000:09:00.0: Detected Critical Xid Error
Fri Jul 10 20:28:39 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.36.06    Driver Version: 450.36.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 208…    On   | 00000000:09:00.0 Off |                  N/A |
| 69%   78C    P2   106W / 250W |   1459MiB / 11018MiB |     38%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

The above is when using 3 HW encoders and 5 SW encoders.

I checked the config file; please try setting the batch-size to be the same as the number of input sources.
Can your SW encoder support encoding at 65 fps (5 x 13 fps)?
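
For reference, "batch-size equal to the number of input sources" means keeping the following keys in step with the source count; a minimal sketch assuming the 8-stream configuration discussed above:

[streammux]
batch-size=8

[primary-gie]
batch-size=8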

Hi,

Yes, the batch-size is the same as the number of input sources.
Yes, the CPU has the capacity to support 5 x 13 FPS (the CPU % used by deepstream-app is about 23%, and the load average is 2-5 on an AMD Ryzen 9 3900X CPU with 12 cores/24 threads, so there is enough capacity).

The throughput is 13 FPS when we have 3 HW and 5 SW encoders with the mux->nvinfer->tracker->demux pipeline running under deepstream-app.

Hi @mchi

I see that this is directly related to the driver. Should a bug report be filed?

Hi @deepak,
Sorry, do you mean this issue got fixed by updating the CUDA driver?

Thanks!

Hi @mchi

No. As this has NOT been fixed so far and no reason for the failure has been found, I was wondering if this should be reported as a bug to the development team.

Is it possible to share a repro with us?

Thanks!

Hi,

This is the standard deepstream-app running on:

CPU: Ryzen 9 3900X, (12 Cores, 24 Threads)
Memory: 64GB
GPU: ZOTAC GAMING GeForce RTX 2080 Ti Twin Fan 11GB GDDR6

You have already seen the configuration file.
Hope this helps.

https://drive.google.com/file/d/1dJvVo0FnELeZFEQFgl6LEDDP7ujDxUGH/view?usp=sharing

FYI: This will be available for a short period. If you are unable to access it, please send a private message/e-mail and we will upload it again and share it with you.

@deepak
Thanks for the repro. It’s helpful.
We will check and get back to you.

Sorry!
Could you share the repro steps for this package?

Hi,

You can go to the Yolo folder and run …/deepstream-app/deepstream-app -c YoloV3-8input-infer-sec123-face-analytics.txt (or any other config file that has 4 RTSP inputs).

You can either use RTSP cameras or simulate RTSP streams using:
cvlc sample_1080p_h264.mp4 ':sout=#gather:transcode{scodec=none}:rtp{sdp=rtsp://:9000/}' :no-sout-all :sout-keep :loop

If this doesn’t work for you, just use the standard deepstream-app provided by NVIDIA and use multiple input and output RTSP streams (as shown in the config file).

Hi @deepak,
Thanks for your repro; we can reproduce this issue now.
We will debug it and give you an update when we have progress.

Thanks!

Hello,

Now that DeepStream 5.0 is announced for General Availability, has this been fixed in the latest release?

Hey,
I am facing a similar error. Have you managed to find a fix for this?

Thanks.