Issue with image classification tutorial and testing with deepstream-app

How did you generate the video file for running in deepstream?
Please consider the way below.
gst-launch-1.0 multifilesrc location="/tmp/%d.jpg" caps="image/jpeg,framerate=30/1" ! jpegdec ! x264enc ! avimux ! filesink location="out.avi"

Hi @Morganh
I didn’t generate the video; I got it from a colleague. The video is not the issue. I successfully used it with an ONNX model, and classification worked correctly in deepstream-app.

To narrow this down, you can try running a standalone Python script to do inference against the TRT engine.
Reference: Inferring resnet18 classification etlt model with python - #41 by Morganh
Per my test result, it gets the same result as tlt-infer.

Hi @Morganh, does this expect images named 0.jpg, 1.jpg, 2.jpg, etc. in the image directory (say, a temp folder)?

If you use the command I mentioned above, then yes, the names 0.jpg, 1.jpg, etc. are expected.
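If your images are not already numbered that way, a small shell sketch like the following can copy them into the sequential 0.jpg, 1.jpg, ... names that the multifilesrc `%d.jpg` pattern expects. The directory names and the placeholder JPEGs here are illustrative only; in practice, point the loop at your real image folder.

```shell
# Illustrative setup only: create two empty placeholder "JPEGs" so the
# loop below has something to copy. Replace with your real images.
mkdir -p ./images ./frames
printf '' > ./images/a.jpg
printf '' > ./images/b.jpg

# Copy every *.jpg into sequentially numbered names: 0.jpg, 1.jpg, ...
i=0
for f in ./images/*.jpg; do
  cp "$f" "./frames/$i.jpg"
  i=$((i + 1))
done
```

After this, `multifilesrc location="./frames/%d.jpg"` would pick the frames up in order.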

Hi @Morganh,
Using a custom Python script wouldn’t really work for us. Our application uses the nvinfer plugin, hence I used NVIDIA’s deepstream-app to validate my model.

@dzmitry.babrovich
Actually, the TLT internal team is syncing on your case, so I ran several experiments against your dataset.
My result:

  1. The standalone way: trigger the tlt 2.0_py3 docker on the host PC and run the standalone Python script against the trt engine. It runs inference very well against images from all three classes and gets the same result as tlt-infer.
  2. The deepstream way: I ran it on a TX2.
  • modify the offset (as I mentioned above)
  • generate the avi file (as I mentioned above)
    It runs inference very well against the “leaked” and “scratched” avi files; their results are better than tlt-infer.
    But unfortunately, for your “good” class avi file, the result is worse than tlt-infer.

Hi @Morganh, I have followed your suggestion and observe the same result as you: now the scratched and leaked classes are detected, but not the good one.

@dzmitry.babrovich
Please add the parameter below to your config file (config_as_primary_gie.txt).
It will solve your latest issue: DeepStream will then get very good inference results on the “good” class avi file too.

scaling-filter=5

“scaling-filter=5” selects NvBufSurfTransformInter_Algo4, which “Specifies GPU-Ignored, VIC-Nicest interpolation.”
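For reference, a minimal sketch of where the parameter sits in an nvinfer configuration file; the comment line stands in for whatever properties config_as_primary_gie.txt already contains:

```
[property]
# existing model / preprocessing settings stay as they are
scaling-filter=5
```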

For more info about “scaling-filter”, please refer to the DeepStream Plugins 5.0 manual
and the NVIDIA DeepStream SDK API Reference: NvBufSurfTransform Types and Functions.

Enumerator
NvBufSurfTransformInter_Nearest
Specifies Nearest Interpolation Method interpolation.

NvBufSurfTransformInter_Bilinear
Specifies Bilinear Interpolation Method interpolation.

NvBufSurfTransformInter_Algo1
Specifies GPU-Cubic, VIC-5 Tap interpolation.

NvBufSurfTransformInter_Algo2
Specifies GPU-Super, VIC-10 Tap interpolation.

NvBufSurfTransformInter_Algo3
Specifies GPU-Lanzos, VIC-Smart interpolation.

**NvBufSurfTransformInter_Algo4**
Specifies GPU-Ignored, VIC-Nicest interpolation.

NvBufSurfTransformInter_Default
Specifies GPU-Nearest, VIC-Nearest interpolation.

Dear @Morganh,
I can confirm that adding this parameter finally fixes the issue, and I now have very good inference results.