How can I access the metadata for each stream in parallel?

debjit.adak · July 8, 2024, 10:08am

Please provide complete information as applicable to your setup.

• Hardware Platform: GPU
• DeepStream Version: 7.0
• TensorRT Version: 8.9
• NVIDIA GPU Driver Version: 545

In DeepStream, every test app uses “add_probe” to access the metadata. My question is: if I have 100 streams and extract the metadata (“frame_meta.pad_index” or “frame_meta.source_id”) from each of the 100 cameras sequentially, it takes more time.
Can you suggest a good approach for processing all the streams in parallel to reduce the total time?

Fiona.Chen · July 9, 2024, 3:23am

After the streams are combined into a batch inside nvstreammux, there is only one batch meta in the pipeline, and reading the frame metas from the batch meta takes very little time. Why do you think that reading the metadata of 100 cameras sequentially takes more time?

debjit.adak · July 9, 2024, 5:51am

Hi @Fiona.Chen

For each stream I am taking the frame out of the buffer. This takes much more time than running without frame access. Done sequentially, by the end of 100 streams it takes a huge amount of time, and frames are also dropped.
This is how I am accessing the frame:
import ctypes
import cupy as cp
import pyds

# if user_data==0:
# owner is not defined in the original snippet; None is assumed here (no Python object owns the buffer)
owner = None
data_type, shape, strides, dataptr, size = pyds.get_nvds_buf_surface_gpu(hash(gst_buffer), frame_meta.batch_id)
# dataptr is of type PyCapsule → use ctypes to retrieve the pointer as an int to pass into cupy
ctypes.pythonapi.PyCapsule_GetPointer.restype = ctypes.c_void_p
ctypes.pythonapi.PyCapsule_GetPointer.argtypes = [ctypes.py_object, ctypes.c_char_p]
# Get pointer to buffer and create UnownedMemory object from the gpu buffer
c_data_ptr = ctypes.pythonapi.PyCapsule_GetPointer(dataptr, None)
unownedmem = cp.cuda.UnownedMemory(c_data_ptr, size, owner)
# Create MemoryPointer object from unownedmem, at index 0
memptr = cp.cuda.MemoryPointer(unownedmem, 0)
# Create a cupy array over the GPU buffer; .get() then copies it into a host-side numpy array
n_frame_gpu = cp.ndarray(shape=shape, dtype=data_type, memptr=memptr, strides=strides, order='C').get()

This is how I access the frame for each stream, and accessing the frame is necessary for us.
Can you suggest a faster way to get the frame, or some other way to optimize this?

Fiona.Chen · July 9, 2024, 7:19am

Copying the video frames out of multiple streams one by one will not be fast, even if you copy the frames in parallel.

There are multiple frames in the batch if you set “batch-size” to a value greater than 1. Maybe you can consider copying the frames in parallel threads. threading — Thread-based parallelism — Python 3.12.4 documentation
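
A minimal sketch of the thread-based idea above, using the standard-library `concurrent.futures`. `copy_frame` and `copy_batch` are hypothetical names: `copy_frame` stands in for whatever per-frame copy the probe performs (for example the `get_nvds_buf_surface_gpu` code earlier in this thread) and only returns dummy bytes here so the sketch runs without DeepStream. Threads can only overlap the copies if the real copy call releases Python's GIL while it runs, which is an assumption to verify by measurement.

```python
from concurrent.futures import ThreadPoolExecutor


def copy_frame(batch_id):
    """Hypothetical stand-in for a per-frame copy; returns dummy pixel bytes."""
    return bytes([batch_id % 256]) * 16


def copy_batch(batch_ids, pool):
    """Copy every frame of one batch using the worker threads in `pool`.

    Executor.map preserves input order, so result i belongs to batch_ids[i].
    """
    return list(pool.map(copy_frame, batch_ids))


# Create the pool once and reuse it for every batch, rather than per buffer.
pool = ThreadPoolExecutor(max_workers=4)
frames = copy_batch(range(8), pool)
pool.shutdown()
```

The pool size (4 here) is arbitrary; it does not need to match the number of streams.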

snehashish.debnath · July 9, 2024, 2:52pm

Hi, regarding pgie_src_pad_buffer_probe in the example deepstream_test_3.py: is there any way we can copy the NvDsBatchMeta, instead of only obtaining it with pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer)), then return Gst.PadProbeReturn.OK and use that BatchMeta for our business logic afterwards?

Consideration: we are handling 100 cameras. Looping through all the frames in each batch and then through all the objects in each frame is delaying the whole pipeline.

Also, regarding the suggested approach: using threads creates extra overhead when they are used for frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data).

Fiona.Chen · July 10, 2024, 1:23am

@snehashish.debnath
Are you a co-worker of the poster of this topic? If not, please create your own topic. Thank you!

debjit.adak · July 10, 2024, 4:26am

Hi @Fiona.Chen

@snehashish.debnath is my co-worker. If you can answer his question, that would be a great help in solving our problem.

Fiona.Chen · July 10, 2024, 8:32am

The batch meta is inside the GstBuffer. pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer)) just gets the batch meta from the GstBuffer. It is very fast, and it is the correct and efficient way to get the batch meta.

Fiona.Chen · July 10, 2024, 8:34am

@debjit.adak
I think your request is to get the frame data from the 100 video streams. What will you do with these frames?

snehashish.debnath · July 10, 2024, 9:07am

I have a requirement to run the pipeline at 25 fps with 100 cameras, and each camera has a minimum of 20 objects.

Reading the batch meta sequentially means looping through each frame → each object → and then, for each object, looping again over classifier_meta_list and obj_user_meta_list.
This is a big bottleneck, and only after it finishes can I return Gst.PadProbeReturn.OK.

So I want to bypass this entire process by copying the batch meta and returning Gst.PadProbeReturn.OK.

Fiona.Chen · July 10, 2024, 9:19am

What is the GPU? What is the video format in the streams? The resolution and framerate?

What do you want to do with the object meta you got? The loop of getting the object meta from the batch meta will not take too much time.
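
One way to check where the time actually goes is to time each stage of the probe against the per-batch budget. At the 25 fps stated in this thread, and assuming one batch holds one frame from each of the 100 sources, each batch has about 40 ms; with 20 objects per camera that is 2000 objects per batch, or roughly 20 µs per object if the metadata loop were to consume the whole budget. The sketch below uses only the standard library; `timed` and `over_budget` are hypothetical helper names, and the two timed blocks are placeholders for the real metadata loop and frame copy.

```python
import time
from contextlib import contextmanager

TARGET_FPS = 25                      # from the requirement stated in this thread
BUDGET_S = 1.0 / TARGET_FPS          # 0.04 s available per batch

timings = {}                         # stage name -> list of elapsed seconds


@contextmanager
def timed(name):
    """Record how long the enclosed block takes under `name`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings.setdefault(name, []).append(time.perf_counter() - start)


def over_budget(name):
    """Return True if the slowest recorded run of `name` exceeds the budget."""
    return max(timings[name]) > BUDGET_S


# Inside a probe, wrap each stage separately, for example:
with timed("metadata_loop"):
    total = sum(range(1000))         # placeholder for the frame/object loops
with timed("frame_copy"):
    time.sleep(0.05)                 # placeholder for a slow per-frame copy
```

Comparing the recorded stages shows whether the metadata loop, the frame copy, or the business logic is the part that exceeds the budget.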

snehashish.debnath · July 10, 2024, 10:46am

With the object meta, we then run our business logic … that is what we use the object data for.

Our fps drops from 25 fps to 18 fps.

Is there any way to copy NvDsBatchMeta and use it later, after returning Gst.PadProbeReturn.OK?

Fiona.Chen · July 11, 2024, 1:35am

If you do your business logic in the GStreamer probe function (callback), please make sure the operation is fast enough, because the probe function blocks the pipeline.

You’d better implement your business logic in a thread other than the pipeline thread.
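
A minimal sketch of that hand-off, using only the standard library. The probe would copy the fields it needs into plain Python objects (the batch meta lives inside the GstBuffer, so references to it should not be kept after the probe returns), push them onto a queue, and return immediately; a single worker thread then serves all streams, so one thread per camera is not required. `on_batch`, `worker`, `business_logic`, and the record layout are hypothetical names chosen for illustration, not DeepStream APIs.

```python
import queue
import threading

work_queue = queue.Queue(maxsize=256)   # bounded so a slow consumer is noticed
results = []


def business_logic(record):
    """Hypothetical placeholder for the application-specific processing."""
    results.append((record["source_id"], len(record["objects"])))


def worker():
    """Consume records until a None sentinel arrives."""
    while True:
        record = work_queue.get()
        if record is None:
            break
        business_logic(record)


def on_batch(records):
    """What the probe would do: enqueue plain-Python records and return at once.

    Records are dropped when the queue is full, so the pipeline never blocks.
    Returns the number of dropped records.
    """
    dropped = 0
    for record in records:
        try:
            work_queue.put_nowait(record)
        except queue.Full:
            dropped += 1
    return dropped


thread = threading.Thread(target=worker, daemon=True)
thread.start()

# Simulate one batch from 3 sources; in a real probe these dicts would be
# filled from the frame and object metadata before the probe returns.
dropped = on_batch([{"source_id": i, "objects": [0] * 20} for i in range(3)])

work_queue.put(None)   # sentinel: stop the worker
thread.join()
```

Dropping on a full queue is one possible policy; a non-zero drop count indicates that the business logic, not the metadata loop, cannot keep up.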

snehashish.debnath · July 11, 2024, 5:48am

I do understand this … but the thread is also extra overhead: creating and managing the threads means everything is done in a round-robin way.

So I want to avoid this bottleneck and eliminate the whole wait by copying the BatchMeta and returning Gst.PadProbeReturn.OK.

Fiona.Chen · July 11, 2024, 6:33am

Getting the object meta and frame meta from the batch meta will not itself be the bottleneck. You need to make sure your business logic with the metadata does not take too much time; otherwise, you need to do the business logic in another thread, so that the metadata and the GstBuffer are released back to the pipeline before your business logic finishes.

@debjit.adak mentioned in another topic that you will get the data of every frame from the pipeline of the 100 streams. I’m wondering why you want to operate on every frame outside the pipeline. Can the operation be done inside a customized plugin with CUDA acceleration?

snehashish.debnath · July 11, 2024, 6:44am

I understand that for my business logic, 100 cameras will lead to 100 threads that do everything in a round-robin way.

I want an approach that copies the BatchMeta, returns Gst.PadProbeReturn.OK, and uses the BatchMeta later.

Regarding @debjit.adak's mention of getting the data of every frame from the pipeline of the 100 streams: we are trying a different approach for that. You asked whether the operation can be done inside a customized plugin with CUDA acceleration. Please suggest some links for this approach.

Fiona.Chen · July 11, 2024, 7:55am

We do not encourage you to copy the whole batch meta. You may copy the parts that are useful to you with the APIs: configure_source_for_ntp_sync — Deepstream Deepstream Version: 7.0 documentation
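
A sketch of copying only the useful parts, assuming the fields of interest are the source id, frame number, class id, tracking id, confidence, and bounding box. `ObjectSnapshot`, `FrameSnapshot`, and `snapshot_frame` are hypothetical names; the attribute names read from the inputs follow the NvDsFrameMeta and NvDsObjectMeta fields. The list traversal and the `cast` calls of a real probe are omitted, and `SimpleNamespace` objects stand in for the metadata so the sketch runs without DeepStream.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass(frozen=True)
class ObjectSnapshot:
    class_id: int
    object_id: int
    confidence: float
    left: float
    top: float
    width: float
    height: float


@dataclass(frozen=True)
class FrameSnapshot:
    source_id: int
    frame_num: int
    objects: tuple


def snapshot_frame(frame_meta, obj_metas):
    """Copy selected fields into plain Python values that outlive the buffer."""
    objects = tuple(
        ObjectSnapshot(
            class_id=int(o.class_id),
            object_id=int(o.object_id),
            confidence=float(o.confidence),
            left=float(o.rect_params.left),
            top=float(o.rect_params.top),
            width=float(o.rect_params.width),
            height=float(o.rect_params.height),
        )
        for o in obj_metas
    )
    return FrameSnapshot(
        source_id=int(frame_meta.source_id),
        frame_num=int(frame_meta.frame_num),
        objects=objects,
    )


# Stand-ins for already-cast frame and object metadata.
fake_obj = SimpleNamespace(
    class_id=2, object_id=11, confidence=0.9,
    rect_params=SimpleNamespace(left=10, top=20, width=30, height=40),
)
fake_frame = SimpleNamespace(source_id=7, frame_num=123)
snap = snapshot_frame(fake_frame, [fake_obj])
```

Because the snapshots hold only Python numbers, they can be handed to another thread after the probe has returned Gst.PadProbeReturn.OK.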

For frame data operations, the Python API is of no use. You need to customize in C/C++ with Gst-nvdsvideotemplate — DeepStream documentation 6.4 documentation; it is open source.

yingliu · August 9, 2024, 6:16am

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.

system · August 23, 2024, 6:16am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
