Hello everyone!
I am getting inconsistent results when running inference with TensorRT. I converted a TensorFlow (Keras) model following the example from the NVIDIA DevBlog:
My goal is to deploy an application that uses TensorRT on the Jetson TX2. Unfortunately there is no Python API available for TensorRT on Jetson. I have modified this example to fit my needs:
There are already small differences between the prediction output of the Keras model and the TensorRT (Python) results after conversion. When running the image classification application (C++), the results differ from the first two as well. I honestly have no idea what causes these differences, but some of them are substantial.
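For what it's worth, a quick way to quantify how far two outputs drift apart is to compare them element-wise. This is only a minimal sketch with made-up placeholder prediction values (not my actual results), but it shows the kind of check I am doing:

```python
# Hypothetical sketch: quantify the deviation between two prediction
# vectors for the same input image (e.g. Keras vs. TensorRT output).
# The lists below are made-up placeholders, not real model outputs.

def max_abs_diff(a, b):
    """Return the largest element-wise absolute difference."""
    assert len(a) == len(b), "outputs must have the same length"
    return max(abs(x - y) for x, y in zip(a, b))

keras_out = [0.01, 0.02, 0.95, 0.02]  # placeholder Keras softmax output
trt_out   = [0.01, 0.03, 0.94, 0.02]  # placeholder TensorRT softmax output

diff = max_abs_diff(keras_out, trt_out)
print(f"max abs diff: {diff:.4f}")
```

My understanding is that tiny FP32 rounding noise (on the order of 1e-5 or less) is expected between frameworks; differences much larger than that would point to something else, such as a preprocessing mismatch.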
A simple example can be found here:
For testing purposes, all of this is run on an x64 host!
Log output of “train.py”:
log_train_py.txt
Using TensorFlow backend.
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
2018-04-25 21:27:47.392990: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-04-25 21:27:47.527610: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-04-25 21:27:47.528025: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties:
name: Quadro M500M major: 5 minor: 0 memoryClockRate(GHz): 1.124
pciBusID: 0000:06:00.0
totalMemory: 1.96GiB freeMemory: 1.55GiB
Log output of “convert.py”:
Log output of “test_keras.py”:
log_test_keras_py.txt
Using TensorFlow backend.
Loading network...
2018-04-25 21:40:00.962271: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-04-25 21:40:01.086140: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-04-25 21:40:01.086521: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties:
name: Quadro M500M major: 5 minor: 0 memoryClockRate(GHz): 1.124
pciBusID: 0000:06:00.0
totalMemory: 1.96GiB freeMemory: 1.55GiB
2018-04-25 21:40:01.086540: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2018-04-25 21:40:01.696561: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
Log output of “test_trt.py”:
Log output of “test_classifier.py”:
Thank you!
Greetings,
Mario
We created a new “Deep Learning Training and Inference” section in Devtalk to improve the experience for deep learning, accelerated computing, and HPC users:
https://devtalk.nvidia.com/default/board/301/deep-learning-training-and-inference-/
We are moving active deep learning threads to the new section.
URLs for topics will not change with the re-categorization, so your bookmarks and links will continue to work as before.
-Siddharth