Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU): GeForce RTX 3090
• DeepStream Version: 6.1
• TensorRT Version: 8.2
• NVIDIA GPU Driver Version (valid for GPU only): 510
• Issue Type( questions, new requirements, bugs): bugs
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing): NA
In the file gstdsexample_optimized.cpp, in gst_dsexample_submit_input_buffer(), the batch size can never reach max_batch_size when process_full_frame is set to false.
/**
 * Called when element receives an input buffer from upstream element.
 */
static GstFlowReturn
gst_dsexample_submit_input_buffer (GstBaseTransform * btrans,
    gboolean discont, GstBuffer * inbuf)
{
  // some code
  num_filled = batch_meta->num_frames_in_batch; // up to the number of sources in the pipeline
  if (dsexample->process_full_frame)
  {
    // process full frame
  } else {
    // process detected object
    i++;
    // Convert batch and push to process thread
    if (batch->frames.size () == dsexample->max_batch_size || i == num_filled) {
      // send full-batch
    }
  }
}
When I set process_full_frame to false and max_batch_size to 4, the batch size can only reach 1 (the number of sources in the pipeline) before a “full-batch” is submitted, even though there are enough detected objects to fill a full batch of size 4.
I added print statements:
g_message ("Send full batch! batch-size = %ld, max-batch-size = %d, i= %d, num_filled= %d",
    batch->frames.size (), dsexample->max_batch_size, i, num_filled);
g_message ("Send non-full batch! batch-size = %ld", batch->frames.size ());
I then ran ./deepstream-opencv-test file://$HOME/Downloads/deepstream-6.1/samples/streams/sample_720p.mp4, and the output was:
Message: 13:04:42.466: Send full batch! batch-size = 1, max-batch-size = 4, i= 1, num_filled= 1
I don’t see it mentioned anywhere that max_batch_size must be less than or equal to batch_meta->num_frames_in_batch when we process detected objects instead of full frames. What is the point of allocating enough memory to hold max_batch_size surfaces when creating dsexample->inter_buf and dsexample->batch_insurf if, in non-full-frame mode, they only ever fill up to the number of sources in the pipeline? Is this the expected behaviour when we process detected objects instead of full frames?
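For context on why I expected batches larger than the number of sources: the start routine sizes both scratch buffers by max_batch_size. Roughly (my paraphrase of the allocation in gstdsexample_optimized.cpp, not a verbatim copy; the exact create params may differ):

/* Paraphrased allocation sketch from gst_dsexample_start() */
NvBufSurfaceCreateParams create_params = { 0 };
create_params.gpuId = dsexample->gpu_id;
create_params.width = dsexample->processing_width;
create_params.height = dsexample->processing_height;
create_params.colorFormat = NVBUF_COLOR_FORMAT_RGBA;
create_params.layout = NVBUF_LAYOUT_PITCH;
create_params.memType = NVBUF_MEM_CUDA_UNIFIED;

/* inter_buf can hold up to max_batch_size converted frames/objects */
NvBufSurfaceCreate (&dsexample->inter_buf, dsexample->max_batch_size, &create_params);

/* batch_insurf can reference up to max_batch_size input surfaces */
dsexample->batch_insurf.surfaceList = new NvBufSurfaceParams[dsexample->max_batch_size];
dsexample->batch_insurf.batchSize = dsexample->max_batch_size;

So the memory is clearly provisioned for max_batch_size entries, yet in non-full-frame mode the submit logic caps the batch at num_frames_in_batch.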
Modified files:
deepstream_opencv_test.c (14.9 KB)
gstdsexample_optimized.cpp (46.1 KB)
I think that when processing detected objects, instead of this code segment:
/* Adding a frame to the current batch. Set the frames members. */
GstDsExampleFrame frame;
frame.scale_ratio_x = scale_ratio;
frame.scale_ratio_y = scale_ratio;
frame.obj_meta = obj_meta;
frame.frame_meta = nvds_get_nth_frame_meta (batch_meta->frame_meta_list, i);
frame.frame_num = frame.frame_meta->frame_num;
frame.batch_index = i;
frame.input_surf_params = in_surf->surfaceList + i;
batch->frames.push_back (frame);
i++;
// Convert batch and push to process thread
if (batch->frames.size () == dsexample->max_batch_size || i == num_filled)
{
  // code
}
it should be the following, so that the batch can reach max_batch_size:
/* Adding a frame to the current batch. Set the frames members. */
GstDsExampleFrame frame;
frame.scale_ratio_x = scale_ratio;
frame.scale_ratio_y = scale_ratio;
frame.obj_meta = obj_meta;
frame.frame_meta = nvds_get_nth_frame_meta (batch_meta->frame_meta_list, frame_meta->batch_id);
frame.frame_num = frame.frame_meta->frame_num;
frame.batch_index = frame_meta->batch_id;
frame.input_surf_params = in_surf->surfaceList + frame_meta->batch_id;
batch->frames.push_back (frame);

// i++;  // not needed when processing detected objects

// Convert batch and push to process thread
if (batch->frames.size () == dsexample->max_batch_size)
{
  // code
}
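For completeness, this is roughly how the modified per-object path would sit in the surrounding loops. The iteration variables (l_frame, l_obj, frame_meta, obj_meta) follow the standard NvDsMeta pattern and are my naming, not necessarily the file’s:

for (NvDsMetaList *l_frame = batch_meta->frame_meta_list; l_frame != NULL;
    l_frame = l_frame->next) {
  NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) l_frame->data;

  for (NvDsMetaList *l_obj = frame_meta->obj_meta_list; l_obj != NULL;
      l_obj = l_obj->next) {
    NvDsObjectMeta *obj_meta = (NvDsObjectMeta *) l_obj->data;

    /* ... crop/scale the object, compute scale_ratio ... */

    GstDsExampleFrame frame;
    frame.scale_ratio_x = scale_ratio;
    frame.scale_ratio_y = scale_ratio;
    frame.obj_meta = obj_meta;
    frame.frame_meta = frame_meta;              /* same as the nth frame_meta at batch_id */
    frame.frame_num = frame_meta->frame_num;
    frame.batch_index = frame_meta->batch_id;   /* index into in_surf->surfaceList */
    frame.input_surf_params = in_surf->surfaceList + frame_meta->batch_id;
    batch->frames.push_back (frame);

    // Submit only when a full batch of objects has accumulated
    if (batch->frames.size () == dsexample->max_batch_size) {
      // convert batch and push to process thread
    }
  }
}

Any objects left over after the loops would still be flushed by the existing end-of-buffer path (the “Send non-full batch” case), so dropping the i == num_filled condition should not leak a partial batch.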