How to improve the performance of new nvstreammux?

1214526999 · December 12, 2024, 6:17am

I am currently developing a project where the input consists of different JPEG images, and the output is object detection information. I am trying to use nvstreammux to collect a batch of images before inference to improve the pipeline throughput.
Below are my test results, where YOLOv8 is used as the inference model. The model’s batch size is always set to match the nvstreammux configuration. The input for filesrc consists of 10,000 480p images (I repeated a single 480p image 10,000 times to simulate the input).

Using the old nvstreammux, batch size set to 4, buffer-pool-size set to 64:
CUDA_VISIBLE_DEVICES=4 USE_NEW_NVSTREAMMUX=no gst-launch-1.0 filesrc location='480p.jpg.10000' ! jpegdec ! nvvideoconvert ! m.sink_0 nvstreammux name=m batch-size=4 width=720 height=480 buffer-pool-size=64 ! nvinfer config-file-path='dstest1_pgie_config.txt' ! fakesink enable-last-sample=False

Execution time: 0:00:30.447701949

Using the old nvstreammux, batch size set to 4, buffer-pool-size at its default value. Compared with test 1, this shows that raising buffer-pool-size to 64 significantly improves pipeline throughput:
CUDA_VISIBLE_DEVICES=4 USE_NEW_NVSTREAMMUX=no gst-launch-1.0 filesrc location='480p.jpg.10000' ! jpegdec ! nvvideoconvert ! m.sink_0 nvstreammux name=m batch-size=4 width=720 height=480 ! nvinfer config-file-path='dstest1_pgie_config.txt' ! fakesink enable-last-sample=False

Execution time: 0:00:42.546280728

Using the new nvstreammux, batch size set to 4. Its throughput is close to test 2:
CUDA_VISIBLE_DEVICES=4 USE_NEW_NVSTREAMMUX=yes gst-launch-1.0 filesrc location='480p.jpg.10000' ! jpegdec ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=RGBA' ! m.sink_0 nvstreammux name=m batch-size=4 config-file-path='./stream_mux_config.txt' ! nvinfer config-file-path='dstest1_pgie_config.txt' ! fakesink enable-last-sample=False

Execution time: 0:00:42.885684190
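
For easier comparison, the three execution times above can be converted to images per second. This is only a small sketch of the arithmetic: the helper name parse_gst_time and the run labels are made up for illustration, and the only inputs are the 10,000-image count and the times reported above.

```python
def parse_gst_time(text):
    """Convert a GStreamer 'H:MM:SS.fraction' duration string to seconds."""
    hours, minutes, seconds = text.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


NUM_IMAGES = 10_000  # images fed through filesrc in every test

# Execution times copied from the three runs above.
runs = {
    "old mux, buffer-pool-size=64": "0:00:30.447701949",
    "old mux, default pool": "0:00:42.546280728",
    "new mux": "0:00:42.885684190",
}

# Throughput in images per second for each run.
throughput = {name: NUM_IMAGES / parse_gst_time(t) for name, t in runs.items()}

for name, fps in throughput.items():
    print(f"{name}: {fps:.1f} images/s")
```

That works out to roughly 328, 235 and 233 images per second, so test 1 is about 1.4 times faster than tests 2 and 3.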

I prefer using the new nvstreammux because it does not require forced resizing of input images. Since the input images are not of fixed resolution, using the old nvstreammux necessitates resizing the images, and nvinfer cannot be made aware of the original image sizes. This requires extra steps to retrieve the original image dimensions and scale the object coordinates output by nvinfer, which increases the program complexity.
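
The extra step I mean looks roughly like the sketch below. It assumes the old nvstreammux stretches each frame to width x height without padding (so x and y scale independently), and that boxes are (left, top, width, height) in muxer coordinates. The function name and the 1440x960 original size are hypothetical, not part of DeepStream.

```python
def scale_box_to_original(box, mux_size, orig_size):
    """Map a box from muxer-output coordinates back to the original image.

    box:       (left, top, width, height) in muxer coordinates
    mux_size:  (width, height) configured on nvstreammux
    orig_size: (width, height) of the image before resizing
    Assumes a plain stretch with no letterbox padding.
    """
    left, top, width, height = box
    mux_w, mux_h = mux_size
    orig_w, orig_h = orig_size
    sx = orig_w / mux_w  # horizontal scale factor
    sy = orig_h / mux_h  # vertical scale factor
    return (left * sx, top * sy, width * sx, height * sy)


# Hypothetical example: a detection in the 720x480 muxer frame,
# mapped back to an image that was originally 1440x960.
print(scale_box_to_original((360, 240, 72, 48), (720, 480), (1440, 960)))
```

The original size of every image has to be carried alongside the buffer by the application, which is the part that adds complexity.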

However, the performance of the old nvstreammux is superior to the new one. Could you suggest any configuration adjustments to further improve the performance of the new nvstreammux, or a way to avoid the forced resizing issue when using the old nvstreammux?

• Hardware Platform (GPU) A100
• DeepStream 7.1
• Issue Type (new requirements)

1214526999 · December 12, 2024, 7:08am

By the way, the stream_mux_config.txt used in test 3 looks like this:
gpu-id=0
max-same-source-frames=4
adaptive-batching=0
max-fps-control=0
overall-max-fps-n=10
overall-max-fps-d=1
overall-min-fps-n=5
overall-min-fps-d=1

Counterintuitively, when I set overall-max-fps-n to 1000 while keeping everything else unchanged, the pipeline took 46 to 48 seconds to complete. I repeated the experiment multiple times and got the same result.

Fiona.Chen · December 12, 2024, 8:00am

Setting the nvstreammux batch size to a value larger than the number of sources will decrease performance, not improve it.

Please refer to Gst-nvstreammux — DeepStream documentation for how the nvstreammux works.

And please refer to Troubleshooting — DeepStream documentation for some performance tips.

If you want to combine the JPEG pictures into a batch, you need to input multiple pictures together. Note that only the old nvstreammux works with the following pipeline, because we don't know whether the resolutions of the pictures are the same.
For example:

gst-launch-1.0 filesrc location='test0.jpg' ! jpegdec ! nvvideoconvert ! m.sink_0 nvstreammux name=m batch-size=4 width=720 height=480 buffer-pool-size=64 ! nvinfer config-file-path='dstest1_pgie_config.txt' ! fakesink enable-last-sample=False filesrc location='test1.jpg' ! jpegdec ! nvvideoconvert ! m.sink_1 filesrc location='test2.jpg' ! jpegdec ! nvvideoconvert ! m.sink_2 filesrc location='test3.jpg' ! jpegdec ! nvvideoconvert ! m.sink_3
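
Writing one branch per picture by hand gets tedious, so the same pipeline description can be generated. The following is a minimal sketch, not an official tool: build_batched_pipeline is a hypothetical helper, it only uses the elements and properties from the command above, and it sets batch-size equal to the number of sources.

```python
def build_batched_pipeline(image_paths, width=720, height=480, pool_size=64,
                           infer_config="dstest1_pgie_config.txt"):
    """Build gst-launch-1.0 arguments with one filesrc branch per image."""
    if not image_paths:
        raise ValueError("at least one image path is required")

    def branch(index, path):
        # One source branch, ending at the muxer's request pad sink_<index>.
        return (f"filesrc location='{path}' ! jpegdec ! nvvideoconvert "
                f"! m.sink_{index}")

    # The first branch also declares the muxer and everything downstream.
    head = (f"{branch(0, image_paths[0])} "
            f"nvstreammux name=m batch-size={len(image_paths)} "
            f"width={width} height={height} buffer-pool-size={pool_size} "
            f"! nvinfer config-file-path='{infer_config}' "
            "! fakesink enable-last-sample=False")
    rest = [branch(i, p) for i, p in enumerate(image_paths[1:], start=1)]
    return " ".join([head] + rest)


files = ["test0.jpg", "test1.jpg", "test2.jpg", "test3.jpg"]
print("gst-launch-1.0 " + build_batched_pipeline(files))
```

With the four file names shown, the printed command matches the one above.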

1214526999 · December 18, 2024, 7:47am

CUDA_VISIBLE_DEVICES=4 USE_NEW_NVSTREAMMUX=yes  gst-launch-1.0 filesrc location='test3.jpg.100' ! nvjpegdec ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=RGBA' ! m.sink_0 nvstreammux name=m  batch-size=4 config-file-path='./stream_mux_config.txt' ! nvinfer config-file-path='dstest1_pgie_config.txt' ! fakesink enable-last-sample=False filesrc location='test3.jpg.100' ! nvjpegdec ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=RGBA' ! m.sink_1 filesrc location='test3.jpg.100' ! nvjpegdec ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=RGBA' ! m.sink_2 filesrc location='test3.jpg.100' ! nvjpegdec ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=RGBA' ! m.sink_3

I tried creating a pipeline with the new nvstreammux element. To ensure compatibility with nvstreammux, I used nvvideoconvert to change the image format from RGB to RGBA. However, I got some warnings from nvvideoconvert.

Everything seems to be functioning correctly, except for these warnings. Could you help me understand if there’s something wrong with the pipeline?

0:00:01.149438511 3154528 0x7f6bf0001010 WARN          nvvideoconvert gstnvvideoconvert.c:2101:gst_nvvideoconvert_fixate_caps:<nvvideoconvert1> nvbuf-memory-type property is set based on SRC caps. Property config setting (if any) is overridden!!
0:00:01.149481394 3154528 0x7f6bf0001010 WARN          nvvideoconvert gstnvvideoconvert.c:2106:gst_nvvideoconvert_fixate_caps:<nvvideoconvert1> gpu-id property is set based on SRC caps. Property config setting (if any) is overridden!!
0:00:01.221535177 3154528 0x7f6bf0001530 WARN          nvvideoconvert gstnvvideoconvert.c:2101:gst_nvvideoconvert_fixate_caps:<nvvideoconvert2> nvbuf-memory-type property is set based on SRC caps. Property config setting (if any) is overridden!!
0:00:01.221560057 3154528 0x7f6bf0001530 WARN          nvvideoconvert gstnvvideoconvert.c:2106:gst_nvvideoconvert_fixate_caps:<nvvideoconvert2> gpu-id property is set based on SRC caps. Property config setting (if any) is overridden!!
0:00:01.288674925 3154528 0x7f6bf00012a0 WARN          nvvideoconvert gstnvvideoconvert.c:2101:gst_nvvideoconvert_fixate_caps:<nvvideoconvert3> nvbuf-memory-type property is set based on SRC caps. Property config setting (if any) is overridden!!
0:00:01.288717283 3154528 0x7f6bf00012a0 WARN          nvvideoconvert gstnvvideoconvert.c:2106:gst_nvvideoconvert_fixate_caps:<nvvideoconvert3> gpu-id property is set based on SRC caps. Property config setting (if any) is overridden!!

Fiona.Chen · December 18, 2024, 10:23am

1214526999:

0:00:01.149438511 3154528 0x7f6bf0001010 WARN          nvvideoconvert gstnvvideoconvert.c:2101:gst_nvvideoconvert_fixate_caps:<nvvideoconvert1> nvbuf-memory-type property is set based on SRC caps. Property config setting (if any) is overridden!!
0:00:01.149481394 3154528 0x7f6bf0001010 WARN          nvvideoconvert gstnvvideoconvert.c:2106:gst_nvvideoconvert_fixate_caps:<nvvideoconvert1> gpu-id property is set based on SRC caps. Property config setting (if any) is overridden!!

These warnings do not affect the pipeline; they are just extra information. It is OK.

Please set the resolution of the nvvideoconvert output if you need to use the new nvstreammux.

USE_NEW_NVSTREAMMUX=yes gst-launch-1.0 filesrc location=1.jpg ! nvjpegdec ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=RGBA,width=1024,height=768' ! mux.sink_0 nvstreammux name=mux batch-size=4 ! nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt ! fakesink filesrc location=2.jpg ! nvjpegdec ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=RGBA,width=1024,height=768' ! mux.sink_1 filesrc location=3.jpg ! nvjpegdec ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=RGBA,width=1024,height=768' ! mux.sink_2 filesrc location=4.jpg ! nvjpegdec ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=RGBA,width=1024,height=768' ! mux.sink_3
