I was wondering if achieving the following results is possible via a conventional DeepStream pipeline.
We want to add a watermark to the processed video stream prior to inference. In other words, if the original data frame is X and the image frame with the added watermark is X’, we would like to pass X’ down the pipeline for inference.
Suppose we would like to perform object detection on X’ and we have a file sink; we want DeepStream to output a video stream with the bounding boxes drawn onto X’ instead of the original image frame X.
Would this be possible using DeepStream? If so, could I get an explanation detailing how to achieve this?
Although this is a general question, I will add some basic development environment-related information down below just in case.
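As an aside, to make the intended X → X’ transform concrete: the watermark step is essentially an alpha blend of a logo onto the frame. A minimal NumPy sketch (the function and array names here are hypothetical, not DeepStream API):

```python
import numpy as np

def add_watermark(frame, logo, x, y, alpha=0.5):
    """Alpha-blend `logo` onto `frame` with its top-left corner at (x, y).

    frame: (H, W, 3) uint8 image X; logo: (h, w, 3) uint8 image.
    Returns the watermarked copy X' without modifying X.
    """
    out = frame.astype(np.float32).copy()
    h, w = logo.shape[:2]
    region = out[y:y + h, x:x + w]
    out[y:y + h, x:x + w] = (1 - alpha) * region + alpha * logo.astype(np.float32)
    return out.astype(np.uint8)

frame = np.zeros((720, 1280, 3), dtype=np.uint8)      # X: black frame
logo = np.full((64, 64, 3), 255, dtype=np.uint8)      # white square as the "logo"
watermarked = add_watermark(frame, logo, x=16, y=16)  # X'
print(watermarked[16, 16])    # blended pixel → [127 127 127]
print(watermarked[500, 500])  # pixel outside the logo is unchanged
```

In a real pipeline this blend would be applied to the frame buffer before the PGIE, so that inference runs on X’ rather than X.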
Thank you for the response. This could be a company logo.
But the main idea would be to transform the original image frame passed down the pipeline so that the SGIE receives the transformed image (image with watermark).
If the PGIE is an object detector and the sink element is a filesink, then the output video file should be the watermarked video with added bounding boxes for objects detected.
Thank you for the response. I will take a look at the gstreamer plugin to see whether such a plugin for updating image frames down the pipeline is available.
In that case, I was wondering: would something like the following be possible?
Suppose that we have an Autoencoder A which takes in an image X and applies a transformation function A(X) to create X’ which is a tensor with the same dimensions as X, the original input frame.
Would the following scenario be possible? If it is the same scenario as the one I described in my previous question, please let me know.
The main idea is that, after PGIE inference, X is replaced with X’ so that the filesink outputs X’ instead of X. The PGIE performs inference on X, and the subsequent SGIEs receive X’ instead of X as input.
DeepStream currently only supports video/image/audio inferencing; general tensor-data inferencing is not supported.
We already have some conversion plugins with HW acceleration in DeepStream, so the answer to your question depends on what kind of transformation “A(X)” is. Please specify your requirements.
Let’s say A is a model, such as an Autoencoder. We build the model A and serialize it into TensorRT engine format (.engine) file.
X here is an image frame from a video. In other words, A is a model that performs video inferencing.
In other words, we can think of X as a 4 dimensional tensor with dimensions (B, C, H, W), where B = size of mini-batch, C = number of channels, which is 3 for RGB images, H = height of the image and W = width of the image frame.
The primary goals are as follows:
Given: an image frame X.
Do the following:
1. Run inference on X using Autoencoder A, producing A(X) = X’. Here, X’ is the output from the autoencoder, with dimensions (B, C, H, W), the same as the original input frame.
2. Replace or embed X’ into the pipeline so that an object detection model O (this would be the SGIE) infers on X’ instead of X, outputting the top-k bounding boxes and class confidence scores.
3. Output video with the bounding boxes drawn on X’ instead of X at the file sink.
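The three steps above can be sketched outside DeepStream to check the data flow; in the real pipeline the overwrite would happen on the mapped NvBufSurface, but the logic is the same. Everything here (the autoencoder and detector stand-ins) is hypothetical:

```python
import numpy as np

B, C, H, W = 2, 3, 64, 64                            # mini-batch, channels, height, width
X = np.random.rand(B, C, H, W).astype(np.float32)    # original frames

def autoencoder(x):
    # Hypothetical stand-in for the TensorRT engine A: any
    # shape-preserving transform of the (B, C, H, W) tensor.
    return (1.0 - x).astype(np.float32)

def detector(x):
    # Hypothetical stand-in for the detector O (the SGIE):
    # returns dummy (x1, y1, x2, y2, label, score) boxes.
    return [(10, 10, 50, 50, "object", 0.9)]

# Step 1: run A on X to produce X'
X_prime = autoencoder(X)
assert X_prime.shape == X.shape                      # same (B, C, H, W) dims

# Step 2: overwrite the buffer in place so downstream elements see X'
X[:] = X_prime

# Step 3: the SGIE infers on the overwritten buffer (now X'),
# and the file sink would draw `boxes` on X' rather than the original X.
boxes = detector(X)
print(len(boxes))
```

The key point is step 2: the overwrite must happen in the same buffer the pipeline passes downstream, not in a separate copy.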
Can network-type=100 be interpreted digit by digit as classifier, detector, detector (i.e., three models with an empty-string delimiter)? Or does the value 100 carry some other meaning?
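For reference, to my understanding the nvinfer config's network-type is a single enum selecting the postprocessing mode, not a concatenation of per-model values: 0 = detector, 1 = classifier, 2 = segmentation, and 100 = "other", which skips nvinfer's built-in postprocessing and is typically paired with output-tensor-meta so the raw output tensors are attached as metadata. A sketch of such a config fragment (please verify against the nvinfer documentation for your DeepStream version):

[property]
network-type=100
output-tensor-meta=1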
The only thing you need to do is to overwrite the NvBufSurface in the nvinfer src pad probe function.
Thank you for the answer. I am guessing that we can overwrite the buffer directly using the GStreamer Python bindings in the nvinfer src pad probe function. Please correct me if I am wrong.
Assuming that I am on the right track, I was wondering whether overwriting the image directly inside the probe function is the right approach, and if so, whether there is a way to ensure that the overwritten data persists down the pipeline.
I am guessing that the code will be structured similarly if this can be achieved using Python GStreamer bindings. For example:
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst
import pyds

def pgie_src_pad_buffer_probe(pad, info, u_data):
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        return Gst.PadProbeReturn.OK
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        # Retrieve the NvBufSurface (as a NumPy view) containing the frame data
        nvds_buf_surface = pyds.get_nvds_buf_surface(hash(gst_buffer), frame_meta.batch_id)
        # How would we overwrite the original image frame with inference output from the PGIE?
        # e.g. autoencoder_output = get_autoencoder_output()
        # Replace the contents of nvds_buf_surface with autoencoder_output
        # and persist it down the pipeline.
        l_obj = frame_meta.obj_meta_list
        while l_obj is not None:
            obj_meta = pyds.NvDsObjectMeta.cast(l_obj.data)
            # (Object metadata is not needed for overwriting the surface,
            # but is iterated here for completeness.)
            l_obj = l_obj.next
        l_frame = l_frame.next
    return Gst.PadProbeReturn.OK

pgie = Gst.ElementFactory.make("nvinfer", "primary-inference")
# Get the src pad and attach the buffer probe
pgie_src_pad = pgie.get_static_pad("src")
pgie_src_pad.add_probe(Gst.PadProbeType.BUFFER, pgie_src_pad_buffer_probe, 0)
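Regarding the "persist down the pipeline" concern: pyds.get_nvds_buf_surface returns a NumPy view of the mapped surface, so the overwrite must be an in-place slice assignment, not a rebinding of the Python variable. A minimal illustration of the difference (the surface array here is simulated with plain NumPy; in the real probe the frame must also be in RGBA format, e.g. via nvvideoconvert, and some platforms/pyds versions require an explicit unmap call afterwards):

```python
import numpy as np

surface = np.zeros((4, 4, 4), dtype=np.uint8)  # simulated RGBA surface backing store
view = surface[:]                               # stand-in for what get_nvds_buf_surface returns

new_data = np.full((4, 4, 4), 200, dtype=np.uint8)

view = new_data             # WRONG: only rebinds the local name; `surface` is untouched
before = surface[0, 0].copy()

view = surface[:]
view[:] = new_data          # RIGHT: slice assignment writes through to the buffer
after = surface[0, 0].copy()

print(before)   # [0 0 0 0]
print(after)    # [200 200 200 200]
```

With the in-place write, downstream elements (the SGIE and the filesink) see the modified pixels, because they read the same underlying buffer.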
Thank you very much for your time and patience once again.