Getting full frames in a pipeline where sources have different resolutions

I see that it’s possible to access the ‘full frame’ as in this example: NVIDIA-AI-IOT/deepstream_python_apps at 2931f6b295b58aed15cb29074d13763c0f8d47be (GitHub).

But it’s a frame with nvstreammux resolution, not with the original resolution of the source.

I have different cameras (FHD, 4K, 5Mp) connected to a single pipeline, and I want to get the best quality possible out of each frame, i.e., I want to get original frames and not the frames resized by nvstreammux.

Usually, nvstreammux is used with FHD resolution, and I see no reason to change that, since neural networks operate on smaller resolutions like 120x120 or 416x416. I’m afraid that if I set nvstreammux to 4K, the pipeline will have to shrink frames by up to a factor of 10 in each dimension before feeding them to the NNs. This resizing will either be expensive or will degrade frame quality, which is crucial for the NNs.
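To put rough numbers on that concern, here is a back-of-the-envelope calculation (assuming a 416x416 network input; the actual scale depends on your nvinfer config):

```python
# Rough per-dimension downscale factors when feeding frames to a NN.
# Resolutions are (width, height); 416 is an assumed network input side.
sources = {"FHD": (1920, 1080), "4K": (3840, 2160), "6K": (6144, 3456)}
nn_side = 416

for name, (w, h) in sources.items():
    factor = w / nn_side
    print(f"{name}: ~{factor:.1f}x smaller per dimension")

# 4K -> 416 is roughly a 9x reduction per dimension (~85x fewer pixels),
# which is where the "up to 10 times smaller" estimate comes from.
```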

I’m thinking about two options:

  1. Set the maximum resolution (4K) on nvstreammux. I don’t think it’s a good idea; see the comments above. Moreover, this solution gets even worse with still bigger resolutions, like 6K.

  2. Attach each original frame to the corresponding GStreamer buffer as NvDsUserMeta before nvstreammux, and read it back at the end of the pipeline. I haven’t seen this approach in the examples and am not sure whether it’s a good idea.

Are there other ways to access full frames? What way is preferable?

This method is not reliable. Some plugins in the pipeline generate new buffers, and the original buffer may be destroyed at any time (GStreamer works asynchronously). If you pass a handle to the buffer downstream and keep a reference to it, you break the buffer’s life cycle: the buffer pool fills up and the pipeline blocks.

I don’t think there is a perfect way to get full frames together with inference results. The first way is probably better, although it will waste resources and degrade performance.

Thanks for the answer,

I will test the first approach; maybe it will be sufficient for me.

I’m feeding frames to nvstreammux at a fairly low rate, 5–10 frames per second. Is it a good idea to copy whole frames into NvDsUserMeta, i.e., to copy not a pointer that may be invalidated but the data itself? Can the data be invalidated in that case too?
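The difference between holding a reference and copying the data can be illustrated with plain Python buffers. This only demonstrates the general aliasing problem; it says nothing about DeepStream’s HW buffers, which the CPU may not be able to access at all:

```python
# A memoryview behaves like a pointer into the underlying buffer:
# if the pipeline recycles/overwrites the buffer, the view goes stale too.
frame_buffer = bytearray(b"frame-0001")

view = memoryview(frame_buffer)   # "pointer": shares storage with the buffer
copy = bytes(frame_buffer)        # deep copy: owns its own storage

frame_buffer[:] = b"frame-0002"   # buffer is recycled for the next frame

print(bytes(view))  # b'frame-0002' -> the reference was invalidated
print(copy)         # b'frame-0001' -> the copy still holds the old frame
```

So a deep copy is safe against buffer reuse, at the cost of a memcpy per frame; a kept reference is not.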

Are there plans to add this functionality (getting full frames) to Deepstream?

If you use compressed video (e.g., H.264) and a HW decoder to decompress it, the frame will be in a HW buffer (sometimes called a GPU buffer). I don’t think you can copy data out of such a buffer: the copy is normally done by the CPU, and the CPU cannot access such a HW buffer.

This is not extra functionality. If you want full-resolution video, you just need HW powerful enough to run the pipeline at full resolution from source to sink. It is not a software limitation or feature; 4K, 6K, and 8K resolutions simply need more HW capability.


Of course, it’s possible now. However, it wastes resources when the sources have different resolutions: if I have one 4K camera and 9 FHD cameras, I have to upsample 9 sources to 4K and then resize all 10 4K streams down to the NN resolution.
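A quick estimate of that overhead (assuming 9 FHD sources plus one 4K source, all batched at 4K by nvstreammux):

```python
# Pixels per frame for each resolution
fhd = 1920 * 1080   # 2,073,600 pixels
uhd = 3840 * 2160   # 8,294,400 pixels

native = 9 * fhd + 1 * uhd   # what the sources actually deliver
batched_4k = 10 * uhd        # what a 4K nvstreammux batch must carry

overhead = batched_4k / native
print(f"~{overhead:.1f}x more pixels per batch")  # roughly 3.1x
```

So upscaling everything to 4K roughly triples the pixel throughput the downstream elements have to handle, before any inference-time downscaling.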

Do I understand correctly that another option is to have a separate pipeline for each resolution?
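For reference, one pipeline per resolution might be sketched like this (a hypothetical gst-launch sketch, untested; element and property names follow common DeepStream usage with the classic nvstreammux, and the URIs and the nvinfer config path are placeholders for your setup):

```shell
# FHD pipeline: mux at 1920x1080, no upscaling of FHD sources
gst-launch-1.0 uridecodebin uri=rtsp://fhd-camera/stream ! m.sink_0 \
  nvstreammux name=m batch-size=1 width=1920 height=1080 ! \
  nvinfer config-file-path=config_infer.txt ! fakesink

# 4K pipeline: mux at 3840x2160, no downscaling before inference
gst-launch-1.0 uridecodebin uri=rtsp://uhd-camera/stream ! m.sink_0 \
  nvstreammux name=m batch-size=1 width=3840 height=2160 ! \
  nvinfer config-file-path=config_infer.txt ! fakesink
```

Each pipeline then batches only sources of matching resolution, at the cost of running multiple pipelines.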

Current DeepStream is based on batch processing, so all videos (images) are scaled to the same resolution to construct the batch.

This can be a solution.

Ok then,

Thanks for your answers and help.