Summary: I can’t get DeepStream to produce correct results for a custom classifier no matter how I set the config for nvinfer.
I trained a custom 2-class ResNet34 classifier with a softmax output in PyTorch and saved the model. I've converted and saved it in 32-bit precision to both ONNX (via onnxruntime) and TensorRT (via trtexec). Additionally, I've let nvinfer build a TensorRT engine from the ONNX model, which is how the DeepStream config is set up.
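For reference, the ONNX export step looked roughly like this (a sketch; the model construction, tensor names, and file name here are illustrative stand-ins for my actual training code):

import torch
import torchvision

# Stand-in for the custom-trained 2-class ResNet34 with softmax output
model = torchvision.models.resnet34()
model.fc = torch.nn.Sequential(
    torch.nn.Linear(model.fc.in_features, 2),
    torch.nn.Softmax(dim=1),
)
model.eval()

dummy = torch.randn(1, 3, 224, 224)  # NCHW, matching infer-dims=3;224;224
torch.onnx.export(model, dummy, "person_classifier_test.onnx",
                  input_names=["input"], output_names=["output"])

The standalone TensorRT engine then came from trtexec --onnx=person_classifier_test.onnx --saveEngine=person_classifier_test.trt (trtexec builds FP32 by default).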
The models require normalisation during preprocessing, and with that normalisation set equivalently everywhere, all four models produce the same softmax output (matching to several decimal places) over a range of images as well as a video. The normalisation looks like this:
from torchvision import transforms

FIXED_OFFSET = 0.449
FIXED_STDDEV = 0.226

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),                    # scales pixels to [0, 1]
    transforms.Normalize([FIXED_OFFSET] * 3,  # same mean for every channel
                         [FIXED_STDDEV] * 3), # same stddev for every channel
])
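As I understand it, nvinfer preprocesses as y = net-scale-factor × (x − offsets) on 0–255 pixel values, while torchvision's ToTensor + Normalize computes y = (x/255 − mean)/std, so the config values further down follow from this arithmetic:

# torchvision: y = (x / 255 - mean) / std
# nvinfer:     y = net_scale_factor * (x - offset), with x in 0..255
# Equating the two:
mean, std = 0.449, 0.226
net_scale_factor = 1 / (255 * std)  # ≈ 0.017352074 -> net-scale-factor
offset = 255 * mean                 # = 114.495     -> offsets (per channel)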
However, I can't get nvinfer to output the same results within a DeepStream pipeline, whether the classifier is used for primary or secondary inference. (I know the results are wrong because I'm reading them out of NvDsClassifierMeta and comparing them to inference outside of DeepStream; sketches of both sides of that comparison are below.) No image resizing is implied by my config, and I've checked all sensible variations of the normalisation settings, which to my knowledge are net-scale-factor, offsets, and model-color-format. Since I use a constant normalisation across all channels, I can't be getting the RGB channel order confused.
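The NvDsClassifierMeta readout is a standard pad probe downstream of nvinfer, roughly like this (a sketch of the secondary, object-attached case; for the primary test I walk the corresponding lists the same way):

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst
import pyds

def classifier_meta_probe(pad, info, _udata):
    # Walk batch -> frame -> object -> classifier -> label metadata
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(info.get_buffer()))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        l_obj = frame_meta.obj_meta_list
        while l_obj is not None:
            obj_meta = pyds.NvDsObjectMeta.cast(l_obj.data)
            l_cls = obj_meta.classifier_meta_list
            while l_cls is not None:
                cls_meta = pyds.NvDsClassifierMeta.cast(l_cls.data)
                l_label = cls_meta.label_info_list
                while l_label is not None:
                    label_info = pyds.NvDsLabelInfo.cast(l_label.data)
                    print(label_info.result_label, label_info.result_prob)
                    l_label = l_label.next
                l_cls = l_cls.next
            l_obj = l_obj.next
        l_frame = l_frame.next
    return Gst.PadProbeReturn.OK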
The two obvious conclusions are: either something in my config still needs to change, or nvinfer is doing something it shouldn't be. I did notice one bizarre behaviour: if I change scaling-filter, the softmax outputs change completely. This is unexpected, because nothing in my DeepStream pipeline should require image/video scaling.
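For completeness, the outside-DeepStream reference numbers come from the ONNX model, roughly like this (the image file name is illustrative):

import onnxruntime as ort
from PIL import Image
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.449] * 3, [0.226] * 3),
])

sess = ort.InferenceSession("person_classifier_test.onnx")
img = Image.open("frame_0001.png").convert("RGB")
x = preprocess(img).unsqueeze(0).numpy()  # 1x3x224x224 float32
(softmax_out,) = sess.run(None, {sess.get_inputs()[0].name: x})
print(softmax_out)  # reference softmax to compare against NvDsClassifierMeta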
My pipeline is equivalent to this:
gst-launch-1.0 filesrc location=example.avi ! avidemux ! h264parse ! avdec_h264 ! nvvideoconvert ! \
m.sink_0 nvstreammux name=m batch-size=1 batched-push-timeout=40000 width=224 height=224 ! \
nvinfer config-file-path=primary_classification_test.txt unique-id=1 ! nvdsosd ! nveglglessink
The nvinfer config looks like this:
[property]
gpu-id=0
infer-dims=3;224;224
net-scale-factor=0.017352074
offsets=114.495;114.495;114.495
onnx-file=/opt/nvidia/deepstream/deepstream-6.3/sources/project/person_classifier_test.onnx
labelfile-path=secondary_labels.txt
#force-implicit-batch-dim=1
batch-size=1
#model-color-format=0
process-mode=1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=0
is-classifier=1
output-blob-names=predictions/Softmax
#output-blob-names=output
#classifier-async-mode=1
classifier-threshold=0
maintain-aspect-ratio=0
input-object-min-width=0
input-object-min-height=0
#operate-on-gie-id=1
#operate-on-class-ids=0;1;2;3
classifier-type=personclassifier
#scaling-filter=0
scaling-compute-hw=0
What am I missing?
• Hardware Platform (Jetson / GPU): RTX 2080
• DeepStream Version: 6.3
• JetPack Version (valid for Jetson only):
• TensorRT Version: 8.5.3.0
• NVIDIA GPU Driver Version (valid for GPU only): 530.41.03
• Issue Type (questions, new requirements, bugs): bug/question
• How to reproduce the issue?: described above (pipeline, config, and command line included)