How to copy and release NvDsBatchMeta to process it in another thread

I am working with DeepStream (DS) 6.0 and its Python bindings. I attached a probe to an nvinfer element so that I can get the NvDsBatchMeta and read the bounding-box data.
Since running code in the probe blocks the entire pipeline (creating a huge bottleneck), I’d like to perform a deep copy of the NvDsBatchMeta and then process it in a separate thread.
I am performing the copy of NvDsBatchMeta with:

user_event_meta = pyds.nvds_acquire_user_meta_from_pool(batch_meta)
batch_meta_copied = pyds.NvDsBatchMeta.cast(
    pyds.nvds_batch_meta_copy_func(batch_meta, user_event_meta)
)

Then, after I am done, I try to deallocate the data with one of these two methods:
Method 1:

user_event_meta = pyds.nvds_acquire_user_meta_from_pool(self.batch_meta)
print("intermediate")
pyds.nvds_batch_meta_release_func(self.batch_meta, user_event_meta)

Method 2:

pyds.nvds_destroy_batch_meta(self.batch_meta)

However, I sometimes get a segmentation fault. Am I doing this correctly? I could not find any documentation on how to do this. I looked at the deepstream_test_4.py example in the NVIDIA-AI-IOT/deepstream_python_apps repository on GitHub and at the get_segmentation_masks page of the DeepStream 6.0 GA Python documentation, but I couldn’t find a solution.

Is it possible to parse the metadata into your own memory and process that in another thread?
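A minimal sketch of that idea, not a definitive implementation: inside the probe, copy only the fields you need into plain Python dicts, push them onto a queue, and let a worker thread consume them. The helper names (`detection_to_dict`, `extract_detections`, `start_worker`) are hypothetical. The metadata traversal follows the pattern used in the deepstream_python_apps samples and is assumed to apply unchanged to your pipeline. This covers the metadata only, not the frame image.

```python
import queue
import threading


def detection_to_dict(frame_num, source_id, obj_meta):
    """Copy the fields of one object into a plain dict (no pyds references kept)."""
    rect = obj_meta.rect_params
    return {
        "frame_num": int(frame_num),
        "source_id": int(source_id),
        "class_id": int(obj_meta.class_id),
        "object_id": int(obj_meta.object_id),
        "confidence": float(obj_meta.confidence),
        "bbox": (float(rect.left), float(rect.top),
                 float(rect.width), float(rect.height)),
    }


def extract_detections(batch_meta):
    """Walk an NvDsBatchMeta inside the probe and return a list of plain dicts."""
    import pyds  # imported lazily so the rest of the sketch runs without DeepStream

    detections = []
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        l_obj = frame_meta.obj_meta_list
        while l_obj is not None:
            obj_meta = pyds.NvDsObjectMeta.cast(l_obj.data)
            detections.append(
                detection_to_dict(frame_meta.frame_num, frame_meta.source_id, obj_meta)
            )
            try:
                l_obj = l_obj.next
            except StopIteration:
                break
        try:
            l_frame = l_frame.next
        except StopIteration:
            break
    return detections


def start_worker(work_queue, handle):
    """Run handle(item) for every queued item until None is received."""
    def run():
        while True:
            item = work_queue.get()
            if item is None:  # sentinel: stop the worker
                break
            handle(item)

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread


# In the probe (assumed usage): work_queue.put(extract_detections(batch_meta)),
# then return immediately so the pipeline is not blocked by the processing.
```

Because the dicts hold ordinary Python values, nothing has to be released back to the DeepStream meta pool afterwards, and the worker never touches memory owned by the pipeline.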

The problem with what you are suggesting is that I will only know after a few seconds whether I need to save a numpy frame image or not. Since I will need to save only a few images, I don’t want to copy every frame to RAM because it will create a bottleneck (see below). Therefore, I would like to find a way to keep the image on the GPU until I know whether I need it.

If I do what you are proposing, I would need to copy the numpy image to RAM each time.

numpy_image = pyds.get_nvds_buf_surface(hash(gst_buffer), frame_meta.batch_id)
numpy_image = numpy_image.copy()  # necessary because the original image will be modified

Copying the numpy image for every frame slows down the entire pipeline by approximately 4x. But if I don’t perform the .copy() operation, the numpy frame will be lost and replaced by newer ones. How could I obtain or generate an NvBufSurfaceCreateParams from an NvDsBatchMeta?

Is there a way I can get a copy of the image on GPU? It should be faster than copying to CPU.

EDIT: Looking at the documentation, I found NvBufSurfaceCopy, but there is no complete example of how to allocate a new buffer and copy the surfaces from NvDsBatchMeta. How can I get the surface from NvDsBatchMeta or NvDsFrameMeta? I’d like to access the raw data on the GPU.

EDIT2: A third option would be to copy the gst_buffer, since that is what get_nvds_buf_surface needs. However, I tried gst_buffer.copy_deep() and it didn’t help, because it did not actually deep-copy the frame data: calling get_nvds_buf_surface with the copied gst_buffer returned the wrong frames.

I think I probably need a Python version of the code in “Access frame pointer in deepstream-app” (post #24 by bcao) so that I can copy the surface. I am not 100% sure what that code does, but it seems to create a copy of the surfaces from a gst buffer. Is that the case? If so, how can I do the same in Python?

That code dumps the video buffer content.
