Custom transform plugin crashes

• Hardware Platform (Jetson / GPU): GPU
• DeepStream Version: 5.0
• TensorRT Version 7.1.3.4
• NVIDIA GPU Driver Version (valid for GPU only) 440.33.01
• Issue Type( questions, new requirements, bugs) question
• How to reproduce the issue ?

I wrote a custom plugin named minimal. I can’t do an in-place transformation, which is why I implemented the functions gst_minimal_transform and gst_minimal_prepare_output_buffer. It is based on https://forums.developer.nvidia.com/t/qustion-of-memory-leak-gst-plugin-based-on-dsexample/145151
For this example, I simply use NvBufSurfaceMemSet in the transform function to set the UV-plane in the output buffer to a value representing purple.
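
To illustrate why a single fill value on the chroma plane shows up as purple: assuming the surface is NV12, the second plane interleaves U and V bytes, so filling it with one byte value sets U = V for every pixel. The sketch below is only an illustration. The helper name yuv_to_rgb, the full-range BT.601 coefficients, and the fill value 255 are assumptions, not taken from the attached plugin.

```cpp
#include <algorithm>
#include <cstdint>

struct Rgb { std::uint8_t r, g, b; };

// Clamp to 0..255 and round to the nearest integer.
static std::uint8_t to_byte(double v) {
    return static_cast<std::uint8_t>(std::min(255.0, std::max(0.0, v)) + 0.5);
}

// Full-range BT.601 YUV -> RGB. The color matrix actually used by the
// pipeline is an assumption here; other matrices give similar hues.
Rgb yuv_to_rgb(std::uint8_t y, std::uint8_t u, std::uint8_t v) {
    const double cb = u - 128.0;  // blue-difference chroma, centered at 128
    const double cr = v - 128.0;  // red-difference chroma, centered at 128
    return {
        to_byte(y + 1.402 * cr),
        to_byte(y - 0.344136 * cb - 0.714136 * cr),
        to_byte(y + 1.772 * cb),
    };
}
```

With these assumptions, yuv_to_rgb(128, 255, 255) gives (255, 0, 255), i.e. magenta/purple, while yuv_to_rgb(128, 128, 128) stays neutral gray. A frame that is not purple therefore still has its original chroma values at the time it is displayed.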

Something seems wrong with the buffers on the GPU. The pipeline crashes with different errors, e.g.

Cuda failure: status=1 in CreateTextureObj at line 2562
nvbufsurftransform.cpp:2624: => Transformation Failed -2

or

Caught SIGSEGV
Spinning.  Please run 'gdb gst-launch-1.0 12753' to continue debugging, Ctrl-C to quit, or Ctrl-\ to dump core.

or

[NvTiler::Composite] ERROR: 349; NvBufSurfTransformComposite failed(-3)
0

or others. Also, not all frames are purple.

Two more observations:

  1. I added a 10 ms delay (USLEEP) in gst_minimal_transform. With this delay the pipeline does not crash, but none of the frames are purple.

  2. When I remove nvinfer from the pipeline, it doesn’t crash even without the delay, and all frames are purple.

It seems the contents of the output buffer are being corrupted or overwritten after I write them, so I assume the buffers are not allocated properly on the GPU. What am I doing wrong?

NB: I cannot use an in-place plugin like dsexample, since the intended purpose of this plugin is to create surfaces with a different batch and frame size than the input.

This is how I run the pipeline:

gst-launch-1.0 videotestsrc pattern=0 ! nvvideoconvert ! "video/x-raw(memory:NVMM)" ! m.sink_0 \
nvstreammux name=m num-surfaces-per-frame=1 batch-size=1 width=800 height=600 ! \
minimal ! \
nvinfer config-file-path= /opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=1 ! \
nvmultistreamtiler rows=1 columns=1 height=600 width=800 ! \
nvvideoconvert nvbuf-memory-type=3 ! \
nveglglessink

Source code of the plugin is attached.
gstminimal.cpp (11.4 KB)
gstminimal.h (3.2 KB)

It may take some time to investigate your code. We will get back to you if we find anything.

Fiona, thanks for the quick response.

As additional information, here is my compile command:

g++ -c -o gstminimal.o -fPIC -DDS_VERSION=\"5.0.0\" -I /usr/local/cuda-10.2/include -I ../../includes -I /opt/nvidia/deepstream/deepstream/sources/includes/ -pthread -I/usr/include/gstreamer-1.0 -I/usr/include/orc-0.4 -I/usr/include/gstreamer-1.0 -I/usr/include/opencv -I/usr/include/json-glib-1.0 -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include gstminimal.cpp
g++ -o libnvdsgst_minimal.so gstminimal.o -shared -Wl,-no-undefined -L/usr/local/cuda-10.2/lib64/ -lcudart -ldl -lnppc -lnppig -lnpps -lnppicc -lnppidei -L/opt/nvidia/deepstream/deepstream-5.0/lib/ -lnvdsgst_helper -lnvdsgst_meta -lnvds_meta -lnvbufsurface -lnvbufsurftransform -Wl,-rpath,/opt/nvidia/deepstream/deepstream-5.0/lib/ -lgstvideo-1.0 -lgstbase-1.0 -lgstreamer-1.0 -lopencv_shape -lopencv_stitching -lopencv_superres -lopencv_videostab -lopencv_aruco -lopencv_bgsegm -lopencv_bioinspired -lopencv_ccalib -lopencv_datasets -lopencv_dpm -lopencv_face -lopencv_freetype -lopencv_fuzzy -lopencv_hdf -lopencv_line_descriptor -lopencv_optflow -lopencv_video -lopencv_plot -lopencv_reg -lopencv_saliency -lopencv_stereo -lopencv_structured_light -lopencv_phase_unwrapping -lopencv_rgbd -lopencv_viz -lopencv_surface_matching -lopencv_text -lopencv_ximgproc -lopencv_calib3d -lopencv_features2d -lopencv_flann -lopencv_xobjdetect -lopencv_objdetect -lopencv_ml -lopencv_xphoto -lopencv_highgui -lopencv_videoio -lopencv_imgcodecs -lopencv_photo -lopencv_imgproc -lopencv_core -ljson-glib-1.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
cp -rv libnvdsgst_minimal.so /opt/nvidia/deepstream/deepstream-5.0/lib/gst-plugins/

Also, another issue with the code is that I can’t control the framerate. I added the sync=1 option to the sink, but it’s still running significantly faster than 30 FPS.

To control the framerate, you can set the videotestsrc framerate with a capsfilter: https://gstreamer.freedesktop.org/documentation/coreelements/capsfilter.html?gi-language=c

I had tried that as well:

gst-launch-1.0 videotestsrc pattern=0 ! "video/x-raw, framerate=30/1" ! nvvideoconvert ! "video/x-raw(memory:NVMM)" ! m.sink_0 nvstreammux name=m num-surfaces-per-frame=1 batch-size=1 width=800 height=600 ! minimal ! nvinfer config-file-path= /opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=1 ! nvmultistreamtiler rows=1 columns=1 height=600 width=800 ! nvvideoconvert nvbuf-memory-type=3 ! nveglglessink

and also with the videorate plugin.

gst-launch-1.0 videotestsrc pattern=0 ! videorate ! "video/x-raw, framerate=30/1" ! nvvideoconvert ! "video/x-raw(memory:NVMM)" ! m.sink_0 nvstreammux name=m num-surfaces-per-frame=1 batch-size=1 width=800 height=600 ! minimal ! nvinfer config-file-path= /opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=1 ! nvmultistreamtiler rows=1 columns=1 height=600 width=800 ! nvvideoconvert nvbuf-memory-type=3 ! nveglglessink

In both cases I get around 90 FPS on my system (this is with the 10ms delay in minimal, so the theoretical max is 100 FPS).
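
The 100 FPS ceiling follows from the delay alone: if every buffer blocks for a fixed time in the transform function, the pipeline cannot exceed one buffer per delay period. A minimal sketch of that arithmetic (the function name max_fps_for_delay is hypothetical):

```cpp
#include <cassert>

// Upper bound on throughput when each buffer blocks for delay_us
// microseconds inside the transform function.
double max_fps_for_delay(double delay_us) {
    assert(delay_us > 0.0);
    return 1e6 / delay_us;  // 1,000,000 microseconds per second
}
```

For the 10 ms delay, max_fps_for_delay(10000.0) is 100. The observed rate of about 90 FPS is close to that bound, which suggests the delay, not the 30/1 caps, is what limits the rate.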

Without the minimal plugin, both pipelines run at 30 FPS.

Any update on this? The pipeline also crashes if I use a queue instead of nvinfer.

Any update on this?

I tried your code with the latest DeepStream 5.1, and it works.