DeepStream - NN with Custom input using nvinfer

Hello,
I’m currently in the process of migrating a legacy application to a DeepStream one. I’ve successfully set up DeepStream with YOLOv5 (using a custom box parser), but I’m also required to migrate a custom in-house NN into a DeepStream pipeline.

Custom NN Specs
The network is a custom RCNN with 3 inputs: image data including background, of shape (bs, 512, 512, 6); a control vector that may contain one of 2 values each execution, of size (bs, 1, 1, 128); and a memory vector of size (bs, 128, 128, 1).
The network has 2 outputs - predictions (segmentation) and the next memory vector, of shape (bs, 128, 128, 1). The memory vector is then fed back to the input on the next execution.
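
To make the I/O contract concrete, here is a minimal numpy sketch of the stateful loop described above. The shapes come from the post; `run_network` is a hypothetical stand-in for the TensorRT engine (its internals are placeholders), and the segmentation output shape `(bs, 512, 512, 1)` is an assumption, since the post does not state it:

```python
import numpy as np

# Shapes from the description (batch size 1 for illustration).
BS = 1
IMG_SHAPE = (BS, 512, 512, 6)     # current frame + background, stacked channel-wise
CTRL_SHAPE = (BS, 1, 1, 128)      # control vector
STATE_SHAPE = (BS, 128, 128, 1)   # memory/state vector

def run_network(image, control, state):
    """Hypothetical stand-in for the TensorRT engine: (segmentation, next_state)."""
    seg = np.zeros((BS, 512, 512, 1), dtype=np.float16)  # placeholder; shape assumed
    next_state = state + 1  # placeholder: the real engine computes the next memory
    return seg, next_state

# Stateful inference loop: each state output is fed back as the next state input.
state = np.zeros(STATE_SHAPE, dtype=np.float16)  # initial memory is all zeros
for _ in range(3):
    image = np.zeros(IMG_SHAPE, dtype=np.float16)
    control = np.zeros(CTRL_SHAPE, dtype=np.float16)
    seg, state = run_network(image, control, state)
```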

What I’ve tried
I’ve successfully converted the NN to a TensorRT FP16 engine (through an ONNX export) - I ran it and it works perfectly.
Looking for solutions, I’ve seen the fasterRCNN example in the DeepStream SDK - but it seems to be just a totally custom model with custom layer implementations and custom inputs that are set only on the first call - though maybe I’m wrong somehow.

[The project is running under DeepStream 6.1.1, using RTX3080/RTX A4000 GPUs]

Is there any way of handling the custom input and output of the model?
Thanks!

Hi @AvivNaaman
Sorry for the late response!
The current DeepStream nvinfer does not support multiple inputs; we will check if there is another solution.

Thanks.

@mchi Have you had any thoughts about our issue? Perhaps a way to wrap model input/output as a part of nvinfer?

Thanks

Hi @AvivNaaman ,
One question, besides inference with nvdsinfer or nvinferserver, how will you do the pre-processing for the three inputs respectively?

The inputs are:

  1. 6-channel batch image (which is actually 2 images: the current frame and a background/previous image that can be switched from time to time); the images are converted to RGB, then to float16 (normalized, of course)
  2. Control vector - of a constant 3-dim shape for each batch item; one specific value in that vector (for each batch item) is 1/0, and it changes periodically after a number of executions.
  3. Last state vector - of another 3-dim shape, initially zeros; after the first execution the NN returns it, so we can feed it back on the next execution, and so on.

The outputs are also highly custom, as mentioned above - one is simply a segmentation map; the other is the next state vector, which, as said, should be fed back into the NN on the next execution.
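
As a small illustration of how the control vector in item 2 could be built, here is a hedged numpy sketch. The flag index and the toggle period are made-up values for the example (the post only says one element is 1/0 and changes periodically):

```python
import numpy as np

CTRL_LEN = 128
TOGGLE_PERIOD = 10  # hypothetical: flip the flag every N executions

def make_control_vector(exec_count, flag_index=0):
    """Build a (1, 1, 1, 128) control vector; one element is 1/0,
    toggling periodically with the execution count (assumed scheme)."""
    ctrl = np.zeros((1, 1, 1, CTRL_LEN), dtype=np.float16)
    ctrl[0, 0, 0, flag_index] = (exec_count // TOGGLE_PERIOD) % 2
    return ctrl
```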

We might also like to trigger the NN only for specific streams, and only part of the time, among other things - but that’s less urgent.

Thanks again,
Aviv

Hi @AvivNaaman ,
I now understand your requirement. The current DeepStream nvinfer & nvinferserver can support such usage, but some deep customization of nvinfer is needed.

Can you direct me to some relevant resources or examples?
I have seen the fasterRCNN example that shows usage of a custom model, but there’s no custom input creation for the regular local nvinfer.
I’ve also seen the feedback example using Triton server, but I’m not sure whether we should use it if we are running additional models through the standard nvinfer - what do you think about it?
Anyway, an example of feeding the NN’s output back into its input and customizing input creation using nvinfer would be awesome.
Thanks!

@mchi, I’d like to have your opinion for the following solution:

  • In the matter of using the background as part of the NN input: I thought about using NvBufSurface with a 6-channel buffer. I will copy the current frame into this image buffer, and copy the background image data into the respective channels, so each image will be of shape (h, w, 6). That will be done using the NvDsVideoTemplate plugin.

  • This 6-channel image will be passed to nvdsinferserver.
    In the matter of state-vector feedback, that may be solved using an nvdsinferserver_custom_process-like implementation (as shown in the objectDetector-fasterRCNN example). I will get the 6-channel image, build the other vectors, and feed them as in the example.
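
A numpy stand-in for the channel-wise packing step proposed above (the real implementation would do this copy on the NvBufSurface inside the NvDsVideoTemplate stage; this only illustrates the intended layout):

```python
import numpy as np

def pack_six_channel(frame_rgb, background_rgb):
    """Concatenate the current frame and the background image channel-wise
    into a single (h, w, 6) buffer - a numpy stand-in for the proposed
    NvBufSurface copy in the NvDsVideoTemplate plugin."""
    assert frame_rgb.shape == background_rgb.shape and frame_rgb.shape[-1] == 3
    return np.concatenate([frame_rgb, background_rgb], axis=-1)
```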

My current issue is with the 6-channel background image data:

  • As shown in the NvBufSurface documentation, allocating a 6-channel image is not directly supported, because it’s an unknown color format.
  • I just want to make sure the custom 6-channel image won’t cause any issues downstream.

Regards,
Aviv.

Hi @AvivNaaman ,

6-channel batch image (which is actually 2 images, one is the current frame and previous can be switched from time to time),

I think we could set the batch size of nvstreammux to 2, so that it passes two successive frames to nvinfer; after nvinfer receives the two frames, it can process them as one 6-channel batch image.
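
In other words, the batch-of-2 suggestion amounts to reshaping a (2, h, w, 3) streammux batch into a (1, h, w, 6) network input. A numpy illustration of that step (the real code would operate on the nvinfer input bindings, not numpy arrays):

```python
import numpy as np

def batch2_to_six_channel(batched_frames):
    """Given an nvstreammux-style batch of two successive (h, w, 3) frames,
    stack them channel-wise into one (1, h, w, 6) network input."""
    assert batched_frames.shape[0] == 2 and batched_frames.shape[-1] == 3
    six = np.concatenate([batched_frames[0], batched_frames[1]], axis=-1)
    return six[np.newaxis, ...]  # add a batch dimension of 1 for the network
```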

  2. Control vector - of some constant 3-dim shape for each batch item, one specific value in that vector (for each batch item) is 1/0 - they change periodically after a number of executions.

You could add that in nvinfer before enqueuev2().

  3. Last state vector - of another 3-dim shape, initially zeros, after first execution the NN returns it - so we can feed it back on the next execution and etc.

You need to copy the network’s state-tensor output into one of the input buffers of enqueuev2(), and use this buffer in the next enqueuev2() call.
The related code:
File: /opt/nvidia/deepstream/deepstream/sources/libs/nvdsinfer/nvdsinfer_context_impl.cpp
Function: InferPostprocessor::copyBuffersToHostMemory(NvDsInferBatch& batch, CudaStream& mainStream)

Currently, the enqueuev2() in nvinfer only supports one input buffer.
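
The feedback mechanism described above can be sketched as a small stateful runner. This is a numpy illustration only - the real code would copy between device buffers around the enqueuev2() call (near copyBuffersToHostMemory in nvdsinfer_context_impl.cpp), and the `+ 1` next-state computation is a placeholder:

```python
import numpy as np

STATE_SHAPE = (1, 128, 128, 1)

class StatefulRunner:
    """After each execution, the state output is copied back into the
    state input buffer for the next call (illustration of the feedback
    loop described above; not real nvinfer code)."""
    def __init__(self):
        self.state_in = np.zeros(STATE_SHAPE, dtype=np.float16)

    def execute(self, image):
        # Stand-in for enqueuev2(): real code binds image/control/state buffers.
        seg = np.zeros_like(image[..., :1])
        state_out = self.state_in + 1  # placeholder next-state computation
        # Feed back: copy the output into the input buffer for the next run.
        self.state_in = state_out.copy()
        return seg, state_out
```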

Not sure if that makes it a little clearer?

Thanks. We’re currently trying to set up nvinferserver as our GIE for the described NN - it seems to be much easier to implement as a first migration step.
After that, we may try to migrate from triton to local nvinfer with the described changes - as a 2nd optimization step.
Currently closing as solved, and will re-open in case of any further issues.


This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.