Output tensor values differ between DeepStream and the Keras / Python TensorRT implementations

aanish.p · November 9, 2021, 9:58am

• Hardware Platform (Jetson / GPU) Jetson Nano 4Gb Dev Kit
• DeepStream Version 5.1
• JetPack Version (valid for Jetson only) 4.5.1 rev1
• TensorRT Version 7.1.3

I have a modified MobileNetV2 model with the following structure:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
mobilenetv2_1.00_224 (Functi (None, 7, 7, 1280)        2257984   
_________________________________________________________________
average_pooling2d_18 (Averag (None, 1, 1, 1280)        0         
_________________________________________________________________
flatten_13 (Flatten)         (None, 1280)              0         
_________________________________________________________________
dense_20 (Dense)             (None, 256)               327936    
_________________________________________________________________
dropout_10 (Dropout)         (None, 256)               0         
_________________________________________________________________
dense_21 (Dense)             (None, 4)                 1028      
=================================================================
Total params: 2,586,948
Trainable params: 2,552,836
Non-trainable params: 34,112
_________________________________________________________________

I am trying to run inference with this model in DeepStream, and the output values I get are very different from the values I get when I run inference with Keras in Python or with TensorRT in Python.
As a measure of the difference, I am using the RMSE (root mean square error) over a set of 740 frames.

DeepStream vs TensorRT on Python
avg rmse of 740 frames=1.96015
max rmse of 740 frames=2.54657
_____________________________________

Keras python vs TensorRT on Python
avg rmse of 740 frames=0.0087
max rmse of 740 frames=0.016
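A minimal sketch of how such per-frame RMSE figures could be computed, assuming each runtime's outputs have been collected as equally shaped arrays. The function names `frame_rmse` and `summarize_rmse` are hypothetical and are not taken from the shared scripts.

```python
import numpy as np

def frame_rmse(a, b):
    """Root mean square error between two output tensors of the same shape."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    return float(np.sqrt(np.mean((a - b) ** 2)))

def summarize_rmse(outputs_a, outputs_b):
    """Return (average, maximum) per-frame RMSE over two equally long sequences."""
    per_frame = [frame_rmse(a, b) for a, b in zip(outputs_a, outputs_b)]
    return float(np.mean(per_frame)), float(np.max(per_frame))
```

Calling `summarize_rmse(deepstream_outputs, tensorrt_outputs)` on two lists of per-frame outputs would give the average and maximum values in the form reported above.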

nvposrt mobilenet - Google Drive: this is the link to the Drive folder with the DeepStream app, model file, engine, and inference scripts. It also contains the TRT engine conversion script, named dottrtbuild.py.

AastaLLL · November 10, 2021, 2:06am

Hi,

Could you first check whether the pre-processing steps in dstest1_pgie_config.txt are identical to those in the TensorRT case?
In DeepStream, the equation is y = net-scale-factor * (x - mean):
https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_plugin_gst-nvinfer.html
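To compare against the Python path, the same equation can be reproduced in NumPy. This is only a sketch of the formula above, not of nvinfer's internals; the function name `nvinfer_style_preprocess` is hypothetical, and the per-channel `offsets` are assumed to play the role of the mean.

```python
import numpy as np

def nvinfer_style_preprocess(frame_hwc, net_scale_factor=1.0, offsets=(0.0, 0.0, 0.0)):
    """Apply y = net_scale_factor * (x - mean) per channel to an (H, W, 3) frame."""
    x = np.asarray(frame_hwc, dtype=np.float32)
    # One mean value per channel, broadcast over height and width.
    mean = np.asarray(offsets, dtype=np.float32)
    return net_scale_factor * (x - mean)
```

With the defaults (scale 1, offsets 0) the frame passes through unchanged, which is the "no preprocessing" case.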

Thanks.

aanish.p · November 10, 2021, 5:43am

hello,
I have set net-scale-factor=1 and offsets=0;0;0, i.e. no preprocessing, but the RMSE is still high. In the TensorRT case there is also no preprocessing; raw RGB inputs are fed to the engine.

To check what is being fed to the model, I made a dummy neural network like this:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
activation (Activation)      (None, 224, 224, 3)       0         
=================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
_________________________________________________________________

which returns the input as the output. I converted it to a TensorRT engine, deployed it in all the scripts, and got the output images from Keras, TensorRT and DeepStream.
Keras and TensorRT gave exactly the same image, and their RMSE was 0.
DeepStream had a very large RMSE of 2000+ when compared with both. The image from DeepStream was different as well. Note: net-scale-factor=1 and offsets=0;0;0 in this case as well.
On TensorRT
Screenshot from 2021-11-10 11-02-43

On DeepStream
Screenshot from 2021-11-10 11-02-24

On DeepStream the image is gray and 9 frames appear.
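One possible reading of this symptom, offered as an assumption and not confirmed anywhere in this thread: a planar (CHW) buffer viewed as if it were interleaved (HWC) gives each pixel three neighbouring values from the same colour plane, so the picture looks gray, and the three planes end up stacked as separate bands. On a real frame each band also holds three squeezed copies side by side, which would give roughly a 3x3 grid. The toy example below only illustrates the gray bands; the function names are hypothetical.

```python
import numpy as np

def misread_planar_as_interleaved(chw):
    """Reinterpret a planar (C, H, W) buffer as (H, W, C) WITHOUT transposing (the mistake)."""
    c, h, w = chw.shape
    return np.ascontiguousarray(chw).reshape(h, w, c)

def planar_to_interleaved(chw):
    """Correctly convert a planar (C, H, W) array to interleaved (H, W, C)."""
    return np.transpose(chw, (1, 2, 0))

# Toy 6x6 frame with a constant value per channel: R=10, G=20, B=30.
hwc = np.empty((6, 6, 3), dtype=np.float32)
hwc[...] = (10.0, 20.0, 30.0)
chw = np.ascontiguousarray(hwc.transpose(2, 0, 1))

wrong = misread_planar_as_interleaved(chw)  # three gray bands: 10, 20, 30
right = planar_to_interleaved(chw)          # identical to the original frame
```

If the script that displays the DeepStream output reshapes the dumped tensor directly to (224, 224, 3), replacing that reshape with the transpose shown in `planar_to_interleaved` would be a quick way to test this hypothesis.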

aanish.p · November 12, 2021, 4:41pm

@AastaLLL any update on this?

AastaLLL · November 15, 2021, 7:05am

Hi,

It’s also recommended to check the down-sampling from the video size to the network input size.
Please set compute-hw=1 and enable-padding=0 to see if it helps.
https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_plugin_gst-nvstreammux.html#gst-properties
(These settings use the GPU for scaling and do not preserve the aspect ratio.)

Thanks.

aanish.p · November 15, 2021, 9:15am

I checked nvstreammux and also set compute-hw=1 and enable-padding=0, but the results are still the same.

aanish.p · November 18, 2021, 6:51am

@AastaLLL any update on this?

AastaLLL · November 22, 2021, 4:45am

Hi,

Thanks for the testing.

We need to investigate further to figure out the issue.
We will share more information with you after checking the source shared above.

aanish.p · November 24, 2021, 9:16am

@AastaLLL any update?

AastaLLL · November 29, 2021, 4:35am

Hi,

Thanks for your patience.

We have checked your source and found that you are using an NHWC ONNX model.
Please note that DeepStream expects the model to be in NCHW format.
(TensorRT itself can support both NCHW and NHWC formats.)

Could you convert the model into NCHW format and try again?
This can be done with the --inputs-as-nchw option in tf2onnx.

If the issue persists, could you also share the ONNX model with us, since TensorRT engines are not portable across devices?

Thanks

aanish.p · November 29, 2021, 4:42am

Will check and let you know soon.

aanish.p · November 29, 2021, 7:35am

I have been using NCHW for DeepStream all along. The previous Drive link was for the MobileNetV2 model; since then I have tried the simple model whose output is the same as its input. I am attaching a new Drive link: inputoutputdiscrepency - Google Drive
This contains:
1. DeepStream app with the simple model
2. The simple model as .h5, .onnx and .trt
3. DSTRTBS1.txt, the model output file from DeepStream
4. frametest.py, the script to view the output from DSTRTBS1.txt
5. Keras inference
6. TensorRT inference

Note: the DeepStream script uses videotestsrc for input.

aanish.p · November 29, 2021, 7:39am

@AastaLLL if it speeds up the debugging, I can make a new post with the observations from the simple model.
I believe this issue is what causes the high RMSE of the MobileNetV2 model. So if we can resolve the simple model, I will also have an answer to the MobileNetV2 high-RMSE issue.

AastaLLL · December 6, 2021, 3:46am

Hi,

Do you use the same TensorRT engine for TensorRT and Deepstream?

In your TRTInferance.py, the buffer is arranged in the NHWC format.

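import numpy as np

# Convert a frame to a float32 tensor laid out as NHWC: (1, 224, 224, 3).
# Note: np.resize repeats or truncates the flattened data; it does not interpolate.
def get_img_tensor(i):
    a = np.array(i).astype(np.float32)
    newsize = (224, 224, 3)
    a = np.resize(a, newsize)
    a = a.reshape((1, 224, 224, 3))
    return a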
...
while success:
    ...
    trt_predictions = predict(get_img_tensor(image)).astype(np.float32)

That’s why we think the engine requires NHWC input to produce the correct result.

Thanks.

aanish.p · December 6, 2021, 4:35am

No, TensorRT uses the NHWC engine and DeepStream uses the NCHW engine. They are two different engines built from the same model.

AastaLLL · December 8, 2021, 5:06am

Hi,

Could you try TensorRT + NCHW to see if the result is correct first?
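For the TensorRT + NCHW test, the Python script would need to hand the engine a planar buffer instead of the NHWC one shown earlier. A minimal sketch, assuming the frame has already been resized to the network resolution and has 3 channels; `to_nchw_batch` is a hypothetical helper, not part of the shared scripts.

```python
import numpy as np

def to_nchw_batch(frame_hwc):
    """Convert one (H, W, 3) frame to a contiguous float32 (1, 3, H, W) batch."""
    x = np.asarray(frame_hwc, dtype=np.float32)
    if x.ndim != 3 or x.shape[2] != 3:
        raise ValueError(f"expected an (H, W, 3) frame, got {x.shape}")
    chw = np.transpose(x, (2, 0, 1))  # HWC -> CHW (a strided view, no copy yet)
    # Add the batch dimension and copy into contiguous memory, so the raw
    # buffer handed to the engine really is laid out plane by plane.
    return np.ascontiguousarray(chw[np.newaxis])
```

The explicit `np.ascontiguousarray` matters here: `np.transpose` alone only changes strides, so copying the underlying bytes to the device without it would still send interleaved data.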
Thanks.

aanish.p · December 8, 2021, 5:23am

Sure, will do and get back to you.

aanish.p · December 8, 2021, 6:24am

TensorRT + NCHW works fine; I am getting the right output.

AastaLLL · December 9, 2021, 3:07am

Hi,

Thanks for the experiment.

Could you update the source accordingly?
So we can check it with the same NCHW engine directly.

aanish.p · December 9, 2021, 4:28am

I have updated the source on the Drive for NCHW.
