Hello, I am trying to use GStreamer to decode an H.265 video file via NVDEC and then transfer the decoded video frames to CUDA memory.
We are using an RTX 4090 GPU, and the system is running Ubuntu 22.04.
My Python code for the GStreamer pipeline is as follows:
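(A minimal sketch of the pipeline in question, reconstructed around nvh265dec and nvvideoconvert as discussed below; the file path and setup code are placeholders.)

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

video_file = "sample.mp4"  # placeholder path

# nvh265dec decodes H.265 via NVDEC; nvvideoconvert (DeepStream)
# converts the frames into NVMM (GPU) memory before the sink.
pipeline = Gst.parse_launch(f"""
filesrc location={video_file} !
qtdemux !
queue !
h265parse !
queue !
nvh265dec !
queue !
nvvideoconvert !
queue !
video/x-raw(memory:NVMM) !
queue !
fakesink name=fakesink
""")
pipeline.set_state(Gst.State.PLAYING)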
What is the specific workflow of the pipeline setup above, and on which buffers or memory do its stages execute?
Does video/x-raw(memory:NVMM) exist on a dGPU (the 4090)? Is it possible to pass the decoded video frames directly to CUDA memory via pointers, to avoid the extra overhead of copying through CPU memory?
For your pipeline, we suggest you use the DeepStream video decoder plugin nvv4l2decoder instead of nvh265dec. gst-nvvideoconvert is a DeepStream plugin, but nvh265dec is not, so we do not guarantee compatibility between them.
If you use the DeepStream video decoder plugin nvv4l2decoder, the decoder's output buffers are GPU buffers, and nvvideoconvert works on GPU buffers directly.
video/x-raw(memory:NVMM) is a DeepStream-specific hardware buffer type. When you use DeepStream on any supported platform (dGPU, Jetson, IGX, …), this special hardware buffer type is used among DeepStream plugins.
Within DeepStream, the plugins handle these hardware buffers directly, so no GPU-to-CPU copy occurs. A GPU-to-CPU memory copy only happens when you use non-DeepStream-compatible plugins to handle DeepStream output data.
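As a quick sanity check that the buffers reaching the sink really are NVMM (GPU) buffers, you can attach a pad probe and inspect the negotiated caps features (a minimal sketch; it assumes the pipeline object and the fakesink name from the code above):

def check_nvmm_probe(pad, info):
    # Report whether the negotiated caps carry the NVMM memory feature.
    caps = pad.get_current_caps()
    if caps is not None:
        features = caps.get_features(0)
        print("memory:NVMM negotiated:", features.contains("memory:NVMM"))
    return Gst.PadProbeReturn.REMOVE  # run once, then detach

sinkpad = pipeline.get_by_name("fakesink").get_static_pad("sink")
sinkpad.add_probe(Gst.PadProbeType.BUFFER, check_nvmm_probe)

On dGPU, the DeepStream Python bindings also expose the underlying device pointer of such buffers (see pyds.get_nvds_buf_surface_gpu in the deepstream_python_apps samples), which allows wrapping decoded frames as CuPy arrays without any host copy.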
We have tried replacing the plugin with nvv4l2decoder, but decoding did not get faster; instead, the measured time increased from 0.037 s to 0.16 s. Could you please explain why? Here is the modified code:
pipeline = Gst.parse_launch(f"""
filesrc location={video_file} !
qtdemux !
queue !
h265parse !
queue !
nvv4l2decoder !
queue !
nvvideoconvert !
queue !
video/x-raw(memory:NVMM) !
queue !
fakesink name=fakesink
“”")
Here is the detailed system configuration.