How to manually attach NvDsBatchMeta (like nvstreammux does)

• Hardware Platform: GPU
• DeepStream Version: 6.3
• Issue Type: questions

We are writing a plugin based on GstBaseTransform that is supposed to take a GstBuffer (already on memory:NVMM), attach NvDsBatchMeta, and push it downstream. It will always be batch-size 1. We cannot use nvstreammux.

This is our current sink template and transform_ip:

static GstStaticPadTemplate gst_nvattachmeta_src_template =
    GST_STATIC_PAD_TEMPLATE ("src",
        GST_PAD_SRC,
        GST_PAD_ALWAYS,
        GST_STATIC_CAPS (GST_VIDEO_CAPS_MAKE_WITH_FEATURES ("memory:NVMM",
            "{ NV12, RGBA, I420 }")));

static GstStaticPadTemplate gst_nvattachmeta_sink_template =
    GST_STATIC_PAD_TEMPLATE ("sink",
        GST_PAD_SINK,
        GST_PAD_ALWAYS,
        GST_STATIC_CAPS (GST_VIDEO_CAPS_MAKE_WITH_FEATURES ("memory:NVMM",
            "{ NV12, RGBA, I420 }")));

static GstFlowReturn
gst_nvattachmeta_transform_ip (GstBaseTransform * trans, GstBuffer * buf)
{
  GstNvattachmeta *nvattachmeta = GST_NVATTACHMETA (trans);

  GST_DEBUG_OBJECT (nvattachmeta, "transform_ip");

  NvDsBatchMeta *batch_meta = nvds_create_batch_meta (1);
  NvDsMeta *meta = gst_buffer_add_nvds_meta (buf, batch_meta, NULL,
      nvds_batch_meta_copy_func, nvds_batch_meta_release_func);
  meta->meta_type = NVDS_BATCH_GST_META;

  return GST_FLOW_OK;
}

The pipeline runs fine like this, but no processing is done (e.g. in nvinferserver), almost as if it were a batch-size 0 buffer. Looking at the source code of nvstreammux, it seems additional steps are necessary.

Please advise on the correct code to attach NvDsBatchMeta to a single-frame GstBuffer.

It is obvious from the source code that the GstBuffer somehow has to be transformed into an NvBufSurface, but the relevant code is hidden behind a compiled library (GstBatchBufferWrapper in gstnvstreammux_impl.h).
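From reading nvdsmeta.h, we suspect that at minimum the batch meta also needs a per-frame NvDsFrameMeta entry. Below is a sketch of what we believe the missing population step looks like, continuing inside transform_ip after the meta is attached (untested; the geometry, source-id, and the frame_count member on our plugin struct are placeholders):

```c
/* Sketch (untested): populate per-frame meta after attaching the batch
 * meta in transform_ip above. "batch_meta", "buf" and "nvattachmeta"
 * are as in the code above; width/height/source-id are placeholders. */
NvDsFrameMeta *frame_meta = nvds_acquire_frame_meta_from_pool (batch_meta);
frame_meta->pad_index = 0;
frame_meta->batch_id = 0;
frame_meta->frame_num = nvattachmeta->frame_count++;  /* hypothetical counter */
frame_meta->buf_pts = GST_BUFFER_PTS (buf);
frame_meta->source_id = 0;
frame_meta->num_surfaces_per_frame = 1;
frame_meta->source_frame_width = 1920;                /* placeholder */
frame_meta->source_frame_height = 1080;               /* placeholder */
nvds_add_frame_meta_to_batch (batch_meta, frame_meta);
```

This would at least give downstream elements one NvDsFrameMeta to iterate over, though it does not address the memory layout question below.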


As far as I know, using nvstreammux is a must if you will be performing processing with DeepStream. The element not only adds batch meta, but it also converts regular NVMM memory into a different type of NVMM that allows you to batch several frames into a single GstBuffer. I call it “batched NVMM”, but there is probably a correct name for it. Even if your batch size is one, the DeepStream elements will fail if you operate over regular NVMM. The final elements in a DeepStream pipeline (nvdsosd, nvstreamdemux, nvmultistreamtiler) also convert this batched NVMM back into regular NVMM.

The operations required to get batched NVMM are unavailable, as nvstreammux is not open source and is distributed as a compiled library.

I am curious as to why you can’t use nvstreammux.

Hi @miguel.taylor. Thanks for responding to this.

We are facing synchronization issues with the “new” nvstreammux. Our pipeline looks like this (slightly simplified):

uridecodebin → new nvstreammux batch-size=1 → nvinferserver batch-size=1 → fakesink (with buffer probe)

We only ever process a single stream per pipeline and rely on the buffers arriving at the fakesink as close to their PTS as possible, but no matter which settings we choose for the new nvstreammux, the buffers either arrive in “chunks” or with completely random “jitter” relative to their PTS.
In DS 6.2 we could set sync-inputs=1 on nvstreammux and sync=true on the fakesink, and that fixed the issue. But this behavior has changed in DS 6.3 and no longer works; if anything, it got worse. We described the issue in detail here and also added measurements: New nvstreammux sync-inputs and max-latency - #7 by
Unfortunately the response was not very helpful, with a few links to already-referenced documentation.

The source code for nvstreammux seems to be open now in DS 6.3, and the synchronization logic looks rather complicated and ambiguous. Most likely the combination of rtpjitterbuffer and the new nvstreammux is causing the issues, but it does not work properly with sync-inputs=0 either. We currently consider the quality of the new nvstreammux to still be very beta, which is further underscored by Nvidia responses in this forum contradicting what we can read in the source code (e.g. when we asked whether nvstreammux can drop buffers, this was denied, yet the source code contains the implementation, multiple comments about it, and even a “dropped” signal).

It would be great if someone from Nvidia could shed some light on this. The source code for nvstreammux is available in DS 6.3, but the most important struct, GstBatchBufferWrapper, is still behind a compiled library and unfortunately inseparable from nvstreammux (it takes a pointer to GstNvStreamMux).

It feels like it shouldn’t be so complicated to write a plugin which just attaches the NvDsMeta for a single frame and otherwise completely acts in passthrough mode. That’s all we need.

The best response in this forum so far regarding this “batched” NVMM is maybe this:

Looks like this is how to instantiate a NvBufSurface and wrap it in a GstBuffer via exposing the pointer to NvBufSurface directly to gst_buffer_new_wrapped_full. But I would like some comments from Nvidia first before I try this.
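Sketched out, that approach might look like the following (untested; based on the NvBufSurface API in nvbufsurface.h; the geometry, color format, and GPU id are placeholders, and we have not verified that downstream elements accept a buffer built this way):

```c
#include <gst/gst.h>
#include "nvbufsurface.h"

/* Sketch (untested): allocate a batched surface of size 1 and expose
 * the NvBufSurface pointer directly as a GstBuffer payload. All
 * parameter values below are placeholders. */
static GstBuffer *
wrap_nvbufsurface (void)
{
  NvBufSurfaceCreateParams params = { 0 };
  params.gpuId = 0;
  params.width = 1920;                       /* placeholder geometry */
  params.height = 1080;
  params.colorFormat = NVBUF_COLOR_FORMAT_NV12;
  params.layout = NVBUF_LAYOUT_PITCH;
  params.memType = NVBUF_MEM_DEFAULT;

  NvBufSurface *surf = NULL;
  if (NvBufSurfaceCreate (&surf, 1, &params) != 0)
    return NULL;
  surf->numFilled = 1;

  /* Wrap the NvBufSurface pointer itself; the destroy notify releases
   * the surface when the last reference to the buffer is dropped. */
  return gst_buffer_new_wrapped_full (GST_MEMORY_FLAG_READONLY,
      surf, sizeof (NvBufSurface), 0, sizeof (NvBufSurface),
      surf, (GDestroyNotify) NvBufSurfaceDestroy);
}
```

Whether this is sufficient, or whether nvstreammux attaches further private data to the buffer, is exactly what we would like Nvidia to confirm.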

Good to know that nvstreammux is finally open source-ish; I didn’t know that. I agree that it should be straightforward to add metadata to a buffer and map it as batched NVMM. We definitely would need some feedback from NVIDIA to achieve that.

We haven’t seen the sync issue that you described, maybe because we have several interpipes and queues in between that handle synchronization and decouple the pipelines. You could try adding interpipes to your original pipeline to see if that solves the issue.

uridecodebin3 ... ! \
queue leaky=2 max-size-buffers=10 ! \
interpipesink sync=true async=false name=src \
interpipesrc is-live=true allow-renegotiation=true stream-sync=2 listen-to=src ! \
nvstreammux0.sink_0 nvstreammux ...

Thanks for the pointer to interpipesink, looks generally very handy. The RidgeRun wiki really has some useful info, thanks!

I will give it a try for our pipeline, but unfortunately the sync issues only happen after nvstreammux. Anything before it is a thing of beauty, including using queues, leaky, and whatnot. I will see whether interpipe fixes some of the NTP or PTS behavior.

I’m waiting for a response from Nvidia on the matter anyway; maybe there is a good pointer to a code snippet that transforms a memory:NVMM GstBuffer into whatever the internal representation is, most likely something like an NvBufSurface with NvBufSurfaceParams. Or somebody has a config for nvstreammux that does the trick.

Just want to clarify that this ticket is not resolved and could still use some insight from Nvidia’s end.

Through trial and error the solution turned out to be simpler than assumed.

This post has all the necessary information.

I can now also confirm that our custom plugin fixes the synchronization issues in our pipeline that we experienced with nvstreammux (sync-inputs=0 or sync-inputs=1, didn’t make a difference).

We will revisit nvstreammux when it is in a stable release state.
