Suggestions on porting an image classification pipeline to DeepStream (Solved)

Hello,

I am working on a system which does the following:

  1. Capture video frame
  2. Split frame into multiple images, with each representing an ROI
  3. Perform image classification on each image
  4. The aggregation of classifications on each of the images corresponding to a frame is the output for that frame

Block diagram:
https://drive.google.com/open?id=19kKIepy_gq3pl_HgT4-tscpP1MuJPitl

I went through the nvgstiva app, but since the above is an unconventional pipeline, I am finding it difficult to come up with a design to port it to DeepStream. Can you please provide some suggestions for the DeepStream pipeline?

  1. How does your OpenCV preprocessing split the frame into multiple images? Do you have a detection model or just a classification model?
  2. What platform will you use?
  3. Multiple sources or just one channel?

<<How does your OpenCV preprocessing split the frame into multiple images? Do you have a detection model or just a classification model?>>
I have the ROI coordinates in hand.

Algorithm:

For each ROI coordinate:
   Create a mask image of all zeroes, then fill the ROI region using cv2.fillPoly
   masked_image = cv2.bitwise_and(frame, mask)
   Perform inference on masked_image

I am using an image classification model, AlexNet.
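The masking step above can be sketched in Python with NumPy. This is a minimal sketch that assumes rectangular ROIs given as (x1, y1, x2, y2) for simplicity; for arbitrary polygon ROIs you would build the mask with cv2.fillPoly as described in the algorithm.

```python
import numpy as np

def mask_roi(frame: np.ndarray, roi: tuple) -> np.ndarray:
    """Zero out everything outside a rectangular ROI (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = roi
    mask = np.zeros_like(frame)
    mask[y1:y2, x1:x2] = 1
    # Multiplying by a 0/1 mask is equivalent to the bitwise AND in the
    # algorithm above.
    return frame * mask

# One masked image per ROI; each would be fed to the classifier separately.
frame = np.arange(16, dtype=np.uint8).reshape(4, 4)
rois = [(0, 0, 2, 2), (2, 2, 4, 4)]
masked_images = [mask_roi(frame, r) for r in rois]
```

Each element of `masked_images` keeps the original pixels inside its ROI and zeroes everywhere else, matching the aggregation scheme in step 4 of the original post.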

<<What platform will you use?>>
Jetson TX2

<<Multiple sources or just one channel?>>
One channel (at least for now)

I think it's OK to port it to DeepStream. You will need to implement your own application rather than using the nvgstiva app.
The pipeline is like this:
opencv capture source + preprocess (masked_image) ->
appsrc plugin ->
gst-nvinfer (sgie) plugin ->
gst-nvosd -> … -> get your metadata in the application.
Since the video source is already raw data, no decoding is needed.
For gst-nvinfer, set your own network/model properties.
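For orientation, a rough gst-launch-1.0 equivalent of that pipeline might look like the sketch below. This is only a sketch: appsrc has to be fed frames from your application code (so in practice the pipeline is built with the GStreamer API, not gst-launch), the caps string and the nvinfer config-file path are placeholders, and element names follow the reply above and may differ between DeepStream releases.

```shell
# Sketch only: appsrc cannot be driven from gst-launch, and
# alexnet_sgie_config.txt is a placeholder for your model properties.
gst-launch-1.0 appsrc caps="video/x-raw,format=RGBA,width=1280,height=720" \
  ! nvinfer config-file-path=alexnet_sgie_config.txt \
  ! nvosd \
  ! fakesink
```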

If you don't have much GStreamer experience, I suggest you use the low-level TensorRT API directly. It's easier for your case.

Thanks, I appreciate you taking the time to respond. A couple more questions, please:

  1. What will be the sink for the GStreamer pipeline?
  2. Also, will there be any performance difference between using TensorRT directly and the GStreamer pipeline mentioned above?
  1. It can be encoding + filesink, eglsink to display, fakesink, and so on, whatever you like.
  2. Using TensorRT directly is more flexible and easier to optimize. DeepStream is a framework which makes use of the low-level video codec, TensorRT, nvosd, nvtracker, and so on.
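For reference, the sink options mentioned in point 1 map to standard Jetson GStreamer elements roughly as sketched below (element names are from the stock Jetson GStreamer stack and may vary by release; the file name is a placeholder):

```shell
# Display on screen (Jetson EGL display sink):
# ... ! nvegltransform ! nveglglessink
# Encode and save to a file:
# ... ! omxh264enc ! qtmux ! filesink location=out.mp4
# Discard the video, keeping only the inference metadata:
# ... ! fakesink
```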