Thanks for your help. I’m working with @kurkur14 on this.
Running trtexec on the model gives:
----------------------------------------------------------------
Input filename: ./model_name_here.onnx
ONNX IR version: 0.0.7
Opset version: 13
Producer name: tf2onnx
Producer version: 1.8.5
Domain:
Model version: 0
Doc string:
----------------------------------------------------------------
[07/06/2021-15:20:20] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
terminate called after throwing an instance of 'std::out_of_range'
what(): Attribute not found: axes
Aborted (core dumped)
Netron shows the input shape of that .onnx as 1x3x299x299.
nvinfer fails with:
ERROR: ModelImporter.cpp:472 In function importModel:
[4] Assertion failed: !_importer_ctx.network()->hasImplicitBatchDimension() && "This version of the ONNX parser only supports TensorRT INetworkDefinitions with an explicit batch dimension. Please ensure the network was created using the EXPLICIT_BATCH NetworkDefinitionCreationFlag."
ERROR: Failed to parse onnx file
ERROR: failed to build network since parsing model errors.
ERROR: failed to build network.
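The assertion above means the parser requires a network created with the explicit-batch flag. With trtexec that is the --explicitBatch switch (confirmed working later in this thread); the file name here is the placeholder from the log above:

```shell
/usr/src/tensorrt/bin/trtexec --explicitBatch --onnx=./model_name_here.onnx
```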
I’ve been doing devops stuff recently so haven’t had much time, but I will try this and get back to you. I have permission to share one of the models if need be.
Not quite yet. Tasked on something else at the moment but I will get back to it soon hopefully. Sorry for holding the issue open. I appreciate the help in any case.
Thanks! I tried an older version of the .onnx, converted with keras2onnx, and with this flag trtexec seems satisfied.
----------------------------------------------------------------
Input filename: qa_edge_latest.onnx
ONNX IR version: 0.0.7
Opset version: 12
Producer name: keras2onnx
Producer version: 1.7.0
Domain: onnxmltools
Model version: 0
Doc string:
----------------------------------------------------------------
[08/19/2021-13:27:10] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[08/19/2021-13:27:10] [W] Dynamic dimensions required for input: input_1, but no shapes were provided. Automatically overriding shape to: 1x299x299x3
[08/19/2021-13:27:34] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[08/19/2021-13:29:00] [I] [TRT] Detected 1 inputs and 1 output network tensors.
[08/19/2021-13:29:03] [I] Starting inference threads
[08/19/2021-13:29:07] [I] Warmup completed 0 queries over 200 ms
[08/19/2021-13:29:07] [I] Timing trace has 0 queries over 3.11609 s
[08/19/2021-13:29:07] [I] Trace averages of 10 runs:
[08/19/2021-13:29:07] [I] Average on 10 runs - GPU latency: 122.669 ms - Host latency: 122.833 ms (end to end 122.853 ms, enqueue 54.9077 ms)
[08/19/2021-13:29:07] [I] Average on 10 runs - GPU latency: 81.9146 ms - Host latency: 82.0219 ms (end to end 82.0319 ms, enqueue 11.799 ms)
[08/19/2021-13:29:07] [I] Average on 10 runs - GPU latency: 81.9658 ms - Host latency: 82.0742 ms (end to end 82.0847 ms, enqueue 10.4944 ms)
[08/19/2021-13:29:07] [I] Host Latency
[08/19/2021-13:29:07] [I] min: 81.8126 ms (end to end 81.8221 ms)
[08/19/2021-13:29:07] [I] max: 457.511 ms (end to end 457.617 ms)
[08/19/2021-13:29:07] [I] mean: 94.4128 ms (end to end 94.4261 ms)
[08/19/2021-13:29:07] [I] median: 82.0251 ms (end to end 82.0359 ms)
[08/19/2021-13:29:07] [I] percentile: 457.511 ms at 99% (end to end 457.617 ms at 99%)
[08/19/2021-13:29:07] [I] throughput: 0 qps
[08/19/2021-13:29:07] [I] walltime: 3.11609 s
[08/19/2021-13:29:07] [I] Enqueue Time
[08/19/2021-13:29:07] [I] min: 4.32751 ms
[08/19/2021-13:29:07] [I] max: 457.194 ms
[08/19/2021-13:29:07] [I] median: 11.614 ms
[08/19/2021-13:29:07] [I] GPU Compute
[08/19/2021-13:29:07] [I] min: 81.7048 ms
[08/19/2021-13:29:07] [I] max: 456.928 ms
[08/19/2021-13:29:07] [I] mean: 94.2882 ms
[08/19/2021-13:29:07] [I] median: 81.9185 ms
[08/19/2021-13:29:07] [I] percentile: 456.928 ms at 99%
[08/19/2021-13:29:07] [I] total compute time: 3.11151 s
&&&& PASSED TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --explicitBatch --onnx=qa_edge_latest.onnx
I will try with DeepStream next. Is there an equivalent option for nvinfer?
Ok. It seems I’m in the same situation my coworker was in. The model in the post directly above was converted using keras2onnx and is in NHWC format. DeepStream did not like that, so I converted to .onnx with tf2onnx, using more or less the same incantation as my coworker:
When I try to trtexec that model, which according to Netron has input shape 1x3x299x299, I get:
...
----------------------------------------------------------------
Input filename: qa_edge_latest.onnx
ONNX IR version: 0.0.7
Opset version: 13
Producer name: tf2onnx
Producer version: 1.9.1
Domain:
Model version: 0
Doc string:
----------------------------------------------------------------
[08/19/2021-16:31:33] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
terminate called after throwing an instance of 'std::out_of_range'
what(): Attribute not found: axes
Aborted
In this case, --explicitBatch does not help.
I am authorized to share one of the models for the purpose of resolving this. What’s the best way to do that?
Could you share the original Keras model as well as the NCHW and NHWC ONNX models?
You can attach the files directly to this topic or share a Drive link with us.
If the public release is a concern, you can share it via private message.
I have successfully converted a Keras model to a TensorRT engine and run it with the autonomous driving platform donkeycar.
It may be helpful, so I will write up the how-to.
【Convert h5 to engine】
The points:
Use tf2onnx
Check inputs of the onnx model using netron
Use --shapes options with trtexec
#https://github.com/onnx/tensorflow-onnx
#pip install -U tf2onnx
#python convert_h5_to_engine.py
########################################
# h5 to onnx
########################################
import os

import onnx
import tf2onnx.convert
from tensorflow.keras.models import load_model

# load the trained Keras model and export it to ONNX
model = load_model('linear.h5')
onnx_model, _ = tf2onnx.convert.from_keras(model)
onnx.save(onnx_model, 'linear.onnx')

########################################
# onnx to engine (shell command)
########################################
command = ('/usr/src/tensorrt/bin/trtexec --onnx=linear.onnx '
           '--saveEngine=linear.engine --fp16 '
           '--shapes=img_in:1x120x160x3 --explicitBatch')
os.system(command)
【Inference】
The points:
No need to convert from HWC to CHW; use HWC format.
import numpy as np

def preprocess(self, image):
    # image: RGB image in HWC layout
    # RGB: convert from [0, 255] to [0.0, 1.0]
    x = image.astype(np.float32) / 255.0
    # HWC to CHW is NOT needed: the keras -> ONNX -> TRT8 model input uses HWC
    #x = x.transpose((2, 0, 1))
    # flatten to a 1-D array for the TensorRT input buffer
    x = x.reshape(-1)
    return x
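A quick sanity check of that preprocessing on a dummy frame, assuming the 120x160x3 input shape from the --shapes flag above (preprocess is reproduced here as a free function for illustration):

```python
import numpy as np

def preprocess(image):
    # RGB: convert from [0, 255] to [0.0, 1.0], keep HWC layout
    x = image.astype(np.float32) / 255.0
    # flatten to a 1-D array for the TensorRT input buffer
    return x.reshape(-1)

frame = np.full((120, 160, 3), 255, dtype=np.uint8)  # dummy all-white frame
x = preprocess(frame)

assert x.shape == (120 * 160 * 3,)  # 57600 values
assert x.dtype == np.float32
assert float(x.max()) == 1.0
```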
DEMO
Requirements:
・Jetson (AGX Xavier or Nano 4GB or 2GB)
・JetPack 4.6
・PC (Ubuntu or Mac or Windows)
【PC】
wget https://github.com/tawnkramer/gym-donkeycar/releases/download/v21.07.24/DonkeySimLinux.zip
unzip DonkeySimLinux.zip
cd DonkeySimLinux
chmod 755 donkey_sim.x86_64
./donkey_sim.x86_64
【Jetson】
# on Jetson
# Launch donkeycar docker
wget https://raw.githubusercontent.com/naisy/overdrive/master/docker/run-donkeycar-jetson.sh
chmod 755 run-donkeycar-jetson.sh
sudo su
./run-donkeycar-jetson.sh
# on Docker container
# Download donkeycar linear model
gdown https://drive.google.com/uc?id=1e7HMUoUfvDzUxK8pvAKntp_HkkoiEEVa -O linear.h5
cp ~/projects/donkeycar_tools/convert_h5_to_engine.py ./
cp ~/projects/donkeycar_tools/racer.py ./
python convert_h5_to_engine.py
# wait for the model to be created
# use pc address for host. (simulator server's ip address)
python racer.py --host=192.168.0.xxx --name=naisy --model=linear.engine --delay=0.1
If the vehicle crashes into the inner cone, change the delay to 0.11.
If the vehicle overruns, change the delay to 0.09.
【Youtube】 I’m sorry, I haven’t recorded a video of the TensorRT version.
However, as in this video, autonomous driving is possible.
Yes, we do this in Python and it works, but it’s kinda silly in our case since there’s a transpose node from NHWC to NCHW inside the network itself (so two useless transposes).
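To illustrate why that pair is pure overhead, here is a quick numpy sketch using the 1x299x299x3 shape from the logs: the host-side transpose and the in-network transpose cancel exactly:

```python
import numpy as np

# a dummy NHWC batch like the 1x299x299x3 input in the logs above
x = np.random.rand(1, 299, 299, 3).astype(np.float32)

nchw = x.transpose(0, 3, 1, 2)           # host-side NHWC -> NCHW
nhwc_again = nchw.transpose(0, 2, 3, 1)  # in-network transpose back to NHWC

# the two transposes undo each other, so both are wasted work
assert nchw.shape == (1, 3, 299, 299)
assert np.array_equal(x, nhwc_again)
```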
We’re going Keras → SavedModel format → tf2onnx.convert…
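If it helps, tf2onnx also has an --inputs-as-nchw option that rewrites the graph so the exported ONNX input is already NCHW, which should remove both transposes. This is a sketch, not something verified on your model: the saved-model path is hypothetical, and the input name input_1 comes from the earlier log (it may need a :0 suffix depending on the tf2onnx version):

```shell
python -m tf2onnx.convert \
    --saved-model ./saved_model_dir \
    --opset 12 \
    --inputs-as-nchw input_1 \
    --output model_nchw.onnx
```

Pinning --opset 12 here would also sidestep the "Attribute not found: axes" failure seen earlier in the thread, since the opset-12 keras2onnx export parsed fine.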
Please check your PMs. I sent the model a while ago. Perhaps a notification didn’t get sent out. Client does not want to share the Keras model but I did send the .onnx versions.
The suggestion is not suitable for a DeepStream pipeline, and the input format is not the issue; the transpose issue is resolved. The issue now is the std::out_of_range in TensorRT. Please review the log above and the .onnx model I shared via PM.