Crop images of detected objects and save them using Service Maker

• Hardware Platform (Jetson / GPU) GPU
• DeepStream Version 7.0
• TensorRT Version 8.6.1.6-1+cuda12.0
• NVIDIA GPU Driver Version (valid for GPU only) NVIDIA-SMI 555.58.02 Driver Version: 556.12 CUDA Version: 12.5
• Issue Type( questions, new requirements, bugs) Question

Good morning. I am new to DeepStream, so I started working with DeepStream v7.0. I am running it in Docker under WSL2 on Windows 10. Since I have a background in C++, I am using Service Maker to create a pipeline.

My development objective is simply to detect people in a specific environment and save various data and images into a folder structure. I have already done this with Python and PyTorch in a proof-of-concept application, but for industrial deployment I will use DeepStream.
Following the examples provided for Service Maker, I managed to detect and track people and to save the detection and tracking information to txt files. The next step is to save the detection crops as jpg images, but I have not been able to do it.
I checked some posts about this problem (without using Service Maker), some of them with a possible solution, for example the one based on the “deepstream_image_meta_test.c” sample, but I think that example cannot be used with Service Maker (I am not sure).
Can anyone give me a hint or an idea about how I can save crops using Service Maker? Or tell me whether some of the solutions already posted can be applied when using Service Maker?

Thanks in advance

Service Maker does not support your needs currently. You might consider using the sources\apps\sample_apps\deepstream-image-meta-test demo for the moment. We will consider whether to provide a similar demo for Service Maker in the future.
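For reference, the core of that demo is the object encoder API from nvds_obj_encode.h: a pad probe fills an NvDsObjEncUsrArgs per detected object and hands the batch surface to the encoder. A minimal sketch of the per-object encoding is below (plain DeepStream API, not Service Maker; the field values are illustrative, so check the sample shipped with your DeepStream version for the exact code):

```cpp
// Sketch of the per-object JPEG encoding done in deepstream-image-meta-test
// (plain DeepStream API, not Service Maker); field values are illustrative.
#include "gstnvdsmeta.h"
#include "nvds_obj_encode.h"
#include "nvbufsurface.h"

static void
encode_detected_objects (NvDsObjEncCtxHandle ctx, GstBuffer *buf, NvBufSurface *ip_surf)
{
  NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta (buf);
  for (NvDsMetaList *l_frame = batch_meta->frame_meta_list; l_frame; l_frame = l_frame->next) {
    NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) l_frame->data;
    for (NvDsMetaList *l_obj = frame_meta->obj_meta_list; l_obj; l_obj = l_obj->next) {
      NvDsObjectMeta *obj_meta = (NvDsObjectMeta *) l_obj->data;
      NvDsObjEncUsrArgs userData = { 0 };
      userData.saveImg = TRUE;   /* write the crop to a jpg file on disk */
      userData.quality = 80;     /* jpeg quality (illustrative value) */
      /* queue the bbox region of this frame for JPEG encoding */
      nvds_obj_enc_process (ctx, &userData, ip_surf, obj_meta, frame_meta);
    }
  }
  nvds_obj_enc_finish (ctx);     /* block until all queued crops are written */
}
```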

Thanks for the quick response @yuweiw and for considering the possibility of a demo. It would be very helpful.

Just updating with some research:
I read the Service Maker documentation again and there is an example for reading the VideoBuffer:
https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_service_maker_intro.html#buffer

I can obtain the format, width and height of the data. The format is NV12, so I understand that processing the data directly within the buffer could be an option.
Do you think that is possible?

Yes. You can get all the bbox information from the probe function. But you still need to use the CPU to process the images, so there may be performance issues.

OK, thank you. I am coding and running different tests. I will update the post if I get some results.

Hi again @yuweiw .

I managed to get the data from the buffer and I can store it as a jpg. However, as you can imagine, the image is distorted because of the format.
For testing I am using the sample_720p.mp4 file provided by NVIDIA inside the containers. Since the format is NV12 (YUV 4:2:0), the full frame size should be 1280 x 720 x 1.5 = 1,382,400 bytes, but the real buffer size is 1,658,880 bytes.
So I suppose there is some additional alignment or padding.
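For reference, this is the arithmetic behind my assumption; the padded-pitch decomposition in the comment is only a guess on my side:

```cpp
// Tight (unpadded) NV12 size for the 1280x720 sample: full-resolution Y plane
// plus an interleaved UV plane at half vertical resolution.
#include <cstdio>

int main()
{
  const unsigned long width = 1280, height = 720;
  const unsigned long tight_size = width * height * 3 / 2;   // 1,382,400 bytes
  const unsigned long observed   = 1658880;                  // size reported by the buffer

  std::printf("tight=%lu observed=%lu ratio=%.2f\n",
              tight_size, observed, (double) observed / tight_size);   // ratio = 1.20
  // One decomposition consistent with the observed size would be a padded pitch of
  // 1536 bytes over 1080 rows (720 Y rows + 360 UV rows): 1536 * 1080 = 1,658,880.
  // That is only a guess; the real pitch and plane offsets must come from the surface.
  return 0;
}
```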

However, after several tests I still cannot get a correct image in jpg format.

Could you tell me if my assumptions are correct, or point me to other posts that could clarify the format or the problem?

Thanks again for your help and time.

Yes. You can refer to our FAQ to learn how to dump the image and how to handle the alignment.

Hello,

I will update the post.

Finally, I achieved my objective of saving images using the VideoBuffer class, following the example from the documentation:
https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_service_maker_intro.html#buffer
It is important to point out a small error in that code (I think; correct me if I am wrong): the example has the word “byte” inside the capture brackets of the lambda function, but it should either be an existing variable or the brackets should be left empty.

I coded a function that takes the data from the p_byte variable (basically the raw video buffer), removes the memory alignment/padding, and converts it from YUV to RGB with the usual matrix transformation, and it works like a charm.
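In case it helps someone else, below is a minimal sketch of that kind of conversion using OpenCV. The width, height and pitch values are assumptions and must be taken from the actual buffer, and it assumes the UV plane starts right after the Y rows with the same pitch:

```cpp
// Sketch: de-stride a pitched NV12 buffer, convert it to BGR with OpenCV,
// then crop/save a detection. width/height/pitch are assumptions here.
#include <opencv2/imgproc.hpp>
#include <opencv2/imgcodecs.hpp>
#include <cstdint>
#include <cstring>
#include <vector>

cv::Mat nv12_to_bgr(const uint8_t* data, int width, int height, int pitch)
{
  // Copy the padded rows into a tight NV12 image: height Y rows + height/2 UV rows.
  // Assumes the UV plane follows the Y plane directly and uses the same pitch.
  std::vector<uint8_t> tight(width * height * 3 / 2);
  for (int row = 0; row < height * 3 / 2; ++row)
    std::memcpy(tight.data() + row * width, data + row * pitch, width);

  // OpenCV takes NV12 as a single-channel Mat with height * 3/2 rows.
  cv::Mat nv12(height * 3 / 2, width, CV_8UC1, tight.data());
  cv::Mat bgr;
  cv::cvtColor(nv12, bgr, cv::COLOR_YUV2BGR_NV12);  // allocates its own memory
  return bgr;
}

// Usage (bbox values come from the object metadata):
//   cv::Mat frame = nv12_to_bgr(p_byte, 1280, 720, pitch);
//   cv::imwrite("crop_0.jpg", frame(cv::Rect(left, top, bbox_w, bbox_h)).clone());
```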

And then, using:
BatchMetadata batch_meta = video_buffer.getBatchMetadata();
I can access the metadata of the video buffer to get the bbox coordinates.
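In case it is useful, this is roughly how I walk the metadata. The iterate() callbacks follow the Service Maker sample apps, and the bbox accessor name is an assumption on my side, so please check the metadata headers of your DeepStream version:

```cpp
// Rough sketch; iterate() follows the Service Maker samples and rectParams()
// is an assumed accessor name -- verify against the Service Maker metadata headers.
BatchMetadata batch_meta = video_buffer.getBatchMetadata();
batch_meta.iterate([&](const FrameMetadata& frame_meta) {
  unsigned int batch_id = frame_meta.batchId();      // index of this frame in the batch
  frame_meta.iterate([&](const ObjectMetadata& object_meta) {
    // bbox of the detection, later used to crop the converted RGB frame
    auto rect = object_meta.rectParams();            // assumed: NvOSD_RectParams-like
    // rect.left, rect.top, rect.width, rect.height
  });
});
```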

I will ask the company I work for about the possibility of sharing the code.

However, when I apply the code to a multicamera setup, I mean more than one source, I can access the metadata from the different sources without problems, but I cannot access all the video buffer frames. I can only access the frame from one of the sources, I think the one with FrameMetadata.batchId() = 0.

Where are the other frames? Is there any function that returns all the video buffers?

They are all in the GstBuffer. Could you try to use the extract API to get the raw data?
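For context, with several sources the batched GstBuffer carries a single NvBufSurface that holds one surface per frame. Once you have the raw GstBuffer, the per-frame data can be reached as sketched below (plain GStreamer/nvbufsurface.h view, not a Service Maker API):

```cpp
// Plain GStreamer / nvbufsurface.h view (not a Service Maker API): a batched
// buffer maps to one NvBufSurface with one entry per frame in surfaceList.
#include <gst/gst.h>
#include "nvbufsurface.h"

static void
inspect_batched_frames (GstBuffer *buf)
{
  GstMapInfo map;
  if (!gst_buffer_map (buf, &map, GST_MAP_READ))
    return;

  NvBufSurface *surface = (NvBufSurface *) map.data;
  // numFilled is the number of frames actually present in this batch
  for (guint i = 0; i < surface->numFilled; i++) {
    NvBufSurfaceParams *frame = &surface->surfaceList[i];
    g_print ("frame %u: %ux%u pitch=%u\n",
             i, frame->width, frame->height, frame->pitch);
    // frame->dataPtr is the per-frame data; index i matches FrameMetadata.batchId()
  }
  gst_buffer_unmap (buf, &map);
}
```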

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.