I did some tests with deepstream-app:
/opt/nvidia/deepstream/deepstream/bin/deepstream-app -c /opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/source2_1080p_dec_infer-resnet_demux_int8.txt
There were 32 RTSP sources at 30 fps in source2_1080p_dec_infer-resnet_demux_int8.txt. Each source had a resolution of 1920*1080. Part of the config:
[streammux]
batch-size=32
batched-push-timeout=33000
width=1920
height=1080
enable-padding=0
buffer-pool-size=16
[primary-gie]
batch-size=32
interval=15
The OSD and sinks were disabled.
To slow down the detector, I made it sleep for 400ms in nvinfer’s code:
static GstFlowReturn
gst_nvinfer_process_full_frame (GstNvInfer * nvinfer, GstBuffer * inbuf,
    NvBufSurface * in_surf)
{
  NvOSD_RectParams rect_params;
  NvDsBatchMeta *batch_meta = NULL;
  guint num_filled = 0;
  std::unique_ptr<GstNvInferBatch> batch = nullptr;
  GstBuffer *conv_gst_buf = nullptr;
  GstFlowReturn flow_ret;
  GstNvInferMemory *memory = nullptr;
  gdouble scale_ratio_x, scale_ratio_y;
  guint offset_left = 0, offset_top = 0;
  gboolean skip_batch;

  /* Process batch only when interval_counter is 0. */
  skip_batch = (nvinfer->interval_counter++ % (nvinfer->interval + 1) > 0);
  if (skip_batch) {
    return GST_FLOW_OK;
  }

  usleep(400 * 1000); // sleep for 400ms and do nothing;
  return GST_FLOW_OK;
...
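The gating above means that, with interval=15, only every 16th batch reaches the sleep. Here is a minimal standalone sketch of that counter logic, to check how often the 400 ms stall fires. The names should_skip_batch and count_processed are hypothetical and not part of nvinfer:

```cpp
#include <cstdint>

// Mirrors the expression in gst_nvinfer_process_full_frame:
//   skip_batch = (interval_counter++ % (interval + 1) > 0)
// The counter is taken by reference so the post-increment is preserved.
// (Hypothetical helper, not part of nvinfer.)
bool should_skip_batch(std::uint32_t &interval_counter, std::uint32_t interval) {
    return (interval_counter++ % (interval + 1)) > 0;
}

// Counts how many of the first `num_batches` batches are NOT skipped,
// i.e. how many times the sleep would be reached.
std::uint32_t count_processed(std::uint32_t num_batches, std::uint32_t interval) {
    std::uint32_t counter = 0;
    std::uint32_t processed = 0;
    for (std::uint32_t i = 0; i < num_batches; ++i) {
        if (!should_skip_batch(counter, interval)) {
            ++processed;
        }
    }
    return processed;
}
```

For example, count_processed(32, 15) returns 2, because only batches 0 and 16 are processed. If a full batch arrives every 1/30 s, 16 batches span about 533 ms, which is longer than the 400 ms sleep.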
With nvstreammux set to width=1920, height=1080 (the same as the RTSP sources), the average fps from the PERF output was 24.66 (1920x1080_log.txt (14.0 KB)). With width=1920, height=1088 (different from the RTSP sources), the average fps was 25.15 (1920x1088_log.txt (14.0 KB)), which is faster than at 1920*1080.
I added some more code in nvinfer:
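static GstFlowReturn
gst_nvinfer_submit_input_buffer (GstBaseTransform * btrans,
    gboolean discont, GstBuffer * inbuf)
{
  /* Time of the previous call; needs #include <chrono>. */
  static auto t1 = std::chrono::high_resolution_clock::now();
  auto t2 = std::chrono::high_resolution_clock::now();
  /* Print the gap since the previous call in milliseconds. The cast keeps
   * the argument type in line with the %lld format specifier. */
  g_print("gst_nvinfer_submit_input_buffer duration %lld\n",
      static_cast<long long>(
          std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count()));
  t1 = t2;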
...
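The same measurement can be pulled out into a small helper that is easier to test outside the plugin. This is only a sketch: CallIntervalTimer is a hypothetical name, and it uses std::chrono::steady_clock instead of high_resolution_clock, on the assumption that a monotonic clock is preferable for measuring intervals.

```cpp
#include <chrono>

// Hypothetical helper, not part of nvinfer: tracks the gap between calls.
// Not thread-safe; intended for a single streaming thread.
class CallIntervalTimer {
public:
    using Clock = std::chrono::steady_clock;

    // Returns the milliseconds elapsed since the previous tick(),
    // or 0 on the first call. `now` can be injected for testing.
    long long tick(Clock::time_point now = Clock::now()) {
        long long elapsed_ms = 0;
        if (has_last_) {
            elapsed_ms = std::chrono::duration_cast<std::chrono::milliseconds>(
                             now - last_).count();
        }
        last_ = now;
        has_last_ = true;
        return elapsed_ms;
    }

private:
    Clock::time_point last_{};
    bool has_last_ = false;
};
```

Inside gst_nvinfer_submit_input_buffer this could be used through one static instance, printing the value returned by tick() with g_print and a %lld format.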
When width=1920, height=1080 (1920x1080.log (74.5 KB)):
...
gst_nvinfer_submit_input_buffer duration 400
gst_nvinfer_submit_input_buffer duration 0
gst_nvinfer_submit_input_buffer duration 0
gst_nvinfer_submit_input_buffer duration 0
gst_nvinfer_submit_input_buffer duration 13
gst_nvinfer_submit_input_buffer duration 1
gst_nvinfer_submit_input_buffer duration 33
gst_nvinfer_submit_input_buffer duration 18
gst_nvinfer_submit_input_buffer duration 19
gst_nvinfer_submit_input_buffer duration 26
gst_nvinfer_submit_input_buffer duration 16
gst_nvinfer_submit_input_buffer duration 22
gst_nvinfer_submit_input_buffer duration 20
gst_nvinfer_submit_input_buffer duration 23
gst_nvinfer_submit_input_buffer duration 18
gst_nvinfer_submit_input_buffer duration 21
...
When width=1920, height=1088 (1920x1088.log (62.4 KB)):
...
gst_nvinfer_submit_input_buffer duration 400
gst_nvinfer_submit_input_buffer duration 0
gst_nvinfer_submit_input_buffer duration 0
gst_nvinfer_submit_input_buffer duration 0
gst_nvinfer_submit_input_buffer duration 0
gst_nvinfer_submit_input_buffer duration 0
gst_nvinfer_submit_input_buffer duration 0
gst_nvinfer_submit_input_buffer duration 0
gst_nvinfer_submit_input_buffer duration 0
gst_nvinfer_submit_input_buffer duration 0
gst_nvinfer_submit_input_buffer duration 0
gst_nvinfer_submit_input_buffer duration 2
gst_nvinfer_submit_input_buffer duration 34
gst_nvinfer_submit_input_buffer duration 33
gst_nvinfer_submit_input_buffer duration 34
gst_nvinfer_submit_input_buffer duration 32
...
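It seems that when nvstreammux's width and height are the same as the source's, it waits some extra time, which slows down the whole pipeline. In the excerpts above, after the 400 ms stall only three buffers arrive back-to-back (0 ms gap) at 1920x1080 before nonzero gaps (mostly 13 to 33 ms) appear, whereas ten buffers arrive back-to-back at 1920x1088.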
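To compare the two full logs rather than just excerpts, the burst of back-to-back buffers after each stall can be counted automatically. The sketch below assumes the log lines have exactly the format printed above; burst_sizes_after_stalls is a hypothetical helper name, and other lines such as the PERF output are ignored.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: for each stall (gap >= stall_ms) found in the log,
// count how many buffers follow it back-to-back, i.e. with a 0 ms gap.
std::vector<int> burst_sizes_after_stalls(const std::string &log_text,
                                          long stall_ms) {
    const std::string prefix = "gst_nvinfer_submit_input_buffer duration ";
    std::vector<int> bursts;
    bool in_burst = false;

    std::istringstream lines(log_text);
    std::string line;
    while (std::getline(lines, line)) {
        // Skip anything that is not one of our duration lines.
        if (line.compare(0, prefix.size(), prefix) != 0) {
            continue;
        }
        std::istringstream number(line.substr(prefix.size()));
        long gap_ms = 0;
        if (!(number >> gap_ms)) {
            continue;  // malformed line, ignore it
        }
        if (gap_ms >= stall_ms) {
            bursts.push_back(0);  // a new stall starts a new burst
            in_burst = true;
        } else if (in_burst && gap_ms == 0) {
            ++bursts.back();      // buffer arrived immediately after the last one
        } else {
            in_burst = false;     // first nonzero gap ends the burst
        }
    }
    return bursts;
}
```

Called with stall_ms = 400 on the excerpts above, this yields a burst of 3 for 1920x1080 and a burst of 10 for 1920x1088.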
Config file:
source2_1080p_dec_infer-resnet_demux_int8.txt (5.7 KB)