DLA results not the same as pure GPU results

Platform : Jetson Xavier NX
JetPack Version: 4.4.1
TensorRT Version: 7.1.3
ONNX Version: 1.10.2

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_21:14:42_PDT_2019
Cuda compilation tools, release 10.2, V10.2.8

I have compared the outputs of my ONNX model and a pure-GPU TensorRT engine, and they are almost identical. But when I compare the pure-GPU engine against the engine built with DLA, there is a large gap between the outputs. I also tried the workarounds mentioned in previous forum threads: I already padded the input shape of each layer to a power of two (2^n), but the results are still bad. JetPack 4.6 works fine, but for some reason we have to stay on 4.4 for now. Is there any way to resolve this?
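For context, the 2^n workaround I mean is padding each tensor's channel dimension up to the next power of two. A minimal sketch of such a padding helper (the function is illustrative, not my exact code), assuming NCHW layout and numpy:

import numpy as np

def pad_channels_pow2(x):
    # Hypothetical helper: pad the channel axis of an NCHW tensor up to
    # the next power of two, per the 2^n shape workaround mentioned above.
    c = x.shape[1]
    target = 1 << (c - 1).bit_length()  # smallest power of two >= c
    return np.pad(x, ((0, 0), (0, target - c), (0, 0), (0, 0)))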

# onnx conversion
python -m tf2onnx.convert --opset 11 --input test_frozen_mobilenet.pb --inputs input_1:0 --outputs Identity:0 --output test_frozen_mobilenet.onnx
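Before building the engine, the exported model can be sanity-checked with onnx and onnxruntime. A minimal sketch (the NHWC 1x64x64x3 input shape is my assumption, based on the TF MobileNet export and the --size 64 flag used below):

import numpy as np
import onnx
import onnxruntime as ort

# Validate the exported graph structure.
onnx.checker.check_model(onnx.load("test_64.onnx"))

# Run a dummy inference to confirm the model executes on CPU.
sess = ort.InferenceSession("test_64.onnx", providers=["CPUExecutionProvider"])
input_name = sess.get_inputs()[0].name
dummy = np.random.rand(1, 64, 64, 3).astype(np.float32)  # assumed NHWC input
print(sess.run(None, {input_name: dummy})[0])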

# trt conversion
./trtexec --onnx=test_64.onnx --fp16 --useDLACore=1 --saveEngine=test_64.trt --verbose --allowGPUFallback
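For reference, the same engine can also be built from the TensorRT 7.x Python API instead of trtexec. This is a sketch with an illustrative workspace size, mirroring the --fp16, --useDLACore=1, and --allowGPUFallback flags:

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)

def build_dla_engine(onnx_path, engine_path, dla_core=1):
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            return None

    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 28            # illustrative size
    config.set_flag(trt.BuilderFlag.FP16)          # DLA needs fp16 or int8
    config.set_flag(trt.BuilderFlag.GPU_FALLBACK)  # --allowGPUFallback
    config.default_device_type = trt.DeviceType.DLA
    config.DLA_core = dla_core                     # --useDLACore=1

    engine = builder.build_engine(network, config)
    with open(engine_path, "wb") as f:
        f.write(engine.serialize())
    return engine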

and I am using Python to run the inference and compare the outputs:
compare_output.py (4.7 KB)
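The attachment is not reproduced here; the TensorRT side of the comparison boils down to roughly this sketch (standard TRT 7.x Python API with pycuda, assuming a single input and a single output binding):

import numpy as np
import pycuda.autoinit  # noqa: F401 -- creates the CUDA context
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def trt_infer(engine_path, inp):
    # Deserialize the engine produced by trtexec above.
    with open(engine_path, "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())
    out_shape = tuple(engine.get_binding_shape(1))  # single output assumed
    out = np.empty(trt.volume(out_shape), dtype=np.float32)
    d_in = cuda.mem_alloc(inp.nbytes)
    d_out = cuda.mem_alloc(out.nbytes)
    with engine.create_execution_context() as context:
        cuda.memcpy_htod(d_in, np.ascontiguousarray(inp))
        context.execute_v2([int(d_in), int(d_out)])
        cuda.memcpy_dtoh(out, d_out)
    return out.reshape(out_shape)

The onnxruntime output is then compared elementwise, e.g. np.max(np.abs(trt_out - onnx_out)).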

python3 compare_output.py --onnx test_64.onnx --trt test_64.trt   --size 64
Sample output:
======  batch 0  ======
dla : [[[[0.54248047]]]
       [[[0.54589844]]]]
onnx : [[0.48507023]
        [0.49452645]]
======  batch 1  ======
dla : [[[[0.54589844]]]
       [[[0.5371094 ]]]]
onnx : [[0.49452645]
        [0.48755872]]

test_64.onnx (455.2 KB)

test_64.trt (1.3 MB)


Does DLA on JetPack 4.6 generate identical results compared to ONNX?
If yes, this might be a known bug that was fixed in the later release.


Yes, the results on JetPack 4.6 are fine, but for now we need it to work on JetPack 4.4 as well. Is there any way to do so?


Since DLA is not open source, we don't have a fix that can run on JetPack 4.4.
Sorry for the inconvenience.
