Hi, all,
I am having difficulty converting my custom-trained tiny_yolov2 model into TensorRT format, so I planned to first test the stock tiny_yolov2.onnx model from
https://github.com/onnx/models/tree/master/vision/object_detection_segmentation/tiny_yolov2
I created a “test” folder at “jetson-inference/data/networks/test/” and put the ONNX model inside, together with the labels file.
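For reference, the model’s declared input and output tensor names can be checked with the onnx Python package (a quick sanity-check sketch; “Model.onnx” is just what I named the downloaded file locally):

import onnx

# Load the downloaded model and print the graph-level tensor names.
# Note: for older opsets, graph.input can also list weight initializers.
model = onnx.load("/home/***/jetson-inference/data/networks/test/Model.onnx")
print("inputs: ", [i.name for i in model.graph.input])
print("outputs:", [o.name for o in model.graph.output])

The TensorRT log below likewise reports ‘image’ as the input binding and ‘grid’ as the output binding.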
Then, from the folder “/jetson-inference/python/examples”, I ran the following commands:
$ NET=/home/***/jetson-inference/data/networks/test
$ python3 detectnet-console.py --model=$NET/Model.onnx --label=$NET/labels.txt /home/***/jetson-inference/data/images/peds_1.jpg /home/***/output.jpg --input_blob=data --output_bbox=bboxes
But I get the following error:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
jetson.inference.__init__.py
jetson.inference -- initializing Python 3.6 bindings…
jetson.inference -- registering module types…
jetson.inference -- done registering module types
jetson.inference -- done Python 3.6 binding initialization
jetson.utils.__init__.py
jetson.utils -- initializing Python 3.6 bindings…
jetson.utils -- registering module functions…
jetson.utils -- done registering module functions
jetson.utils -- registering module types…
jetson.utils -- done registering module types
jetson.utils -- done Python 3.6 binding initialization
[image] loaded '/home/***/jetson-inference/data/images/peds_1.jpg' (1920 x 1080, 3 channels)
jetson.inference -- PyTensorNet_New()
jetson.inference -- PyDetectNet_Init()
jetson.inference -- detectNet loading network using argv command line params
jetson.inference -- detectNet.init() argv[0] = 'detectnet-console.py'
jetson.inference -- detectNet.init() argv[1] = '--model=/home/***/jetson-inference/data/networks/test/Model.onnx'
jetson.inference -- detectNet.init() argv[2] = '--label=/home/***/jetson-inference/data/networks/test/labels.txt'
jetson.inference -- detectNet.init() argv[3] = '/home/***/jetson-inference/data/images/peds_1.jpg'
jetson.inference -- detectNet.init() argv[4] = '/home/***/output.jpg'
jetson.inference -- detectNet.init() argv[5] = '--input_blob=data'
jetson.inference -- detectNet.init() argv[6] = '--output_bbox=bboxes'
detectNet -- loading detection network model from:
-- prototxt NULL
-- model /home/***/jetson-inference/data/networks/test/Model.onnx
-- input_blob 'data'
-- output_cvg 'coverage'
-- output_bbox 'bboxes'
-- mean_pixel 0.000000
-- mean_binary NULL
-- class_labels NULL
-- threshold 0.500000
-- batch_size 1
[TRT] TensorRT version 5.1.6
[TRT] loading NVIDIA plugins…
[TRT] Plugin Creator registration succeeded - GridAnchor_TRT
[TRT] Plugin Creator registration succeeded - NMS_TRT
[TRT] Plugin Creator registration succeeded - Reorg_TRT
[TRT] Plugin Creator registration succeeded - Region_TRT
[TRT] Plugin Creator registration succeeded - Clip_TRT
[TRT] Plugin Creator registration succeeded - LReLU_TRT
[TRT] Plugin Creator registration succeeded - PriorBox_TRT
[TRT] Plugin Creator registration succeeded - Normalize_TRT
[TRT] Plugin Creator registration succeeded - RPROI_TRT
[TRT] Plugin Creator registration succeeded - BatchedNMS_TRT
[TRT] completed loading NVIDIA plugins.
[TRT] detected model format - ONNX (extension '.onnx')
[TRT] desired precision specified for GPU: FASTEST
[TRT] requested fasted precision for device GPU without providing valid calibrator, disabling INT8
[TRT] native precisions detected for GPU: FP32, FP16
[TRT] selecting fastest native precision for GPU: FP16
[TRT] attempting to open engine cache file /home/***/jetson-inference/data/networks/test/Model.onnx.1.1.GPU.FP16.engine
[TRT] cache file not found, profiling network model on device GPU
[TRT] device GPU, loading /usr/bin/ /home/***/jetson-inference/data/networks/test/Model.onnx
Input filename: /home/***/jetson-inference/data/networks/test/Model.onnx
ONNX IR version: 0.0.5
Opset version: 8
Producer name: OnnxMLTools
Producer version: 1.5.2
Domain: onnxconverter-common
Model version: 0
Doc string: The Tiny YOLO network from the paper 'YOLO9000: Better, Faster, Stronger' (2016), arXiv:1612.08242
WARNING: ONNX model has a newer ir_version (0.0.5) than this parser was built against (0.0.3).
[TRT] scalerPreprocessor_scaled:Mul → (3, 416, 416)
[TRT] image2:Add → (3, 416, 416)
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:771: Convolution input dimensions: (3, 416, 416)
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:835: Using kernel: (3, 3), strides: (1, 1), padding: (0, 0), dilations: (1, 1), numOutputs: 16
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:836: Convolution output dimensions: (16, 416, 416)
[TRT] convolution2d_1_output:Conv → (16, 416, 416)
[TRT] batchnormalization_1_output:BatchNormalization → (16, 416, 416)
[TRT] leakyrelu_1_output:LeakyRelu → (16, 416, 416)
[TRT] maxpooling2d_1_output:MaxPool → (16, 208, 208)
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:771: Convolution input dimensions: (16, 208, 208)
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:835: Using kernel: (3, 3), strides: (1, 1), padding: (0, 0), dilations: (1, 1), numOutputs: 32
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:836: Convolution output dimensions: (32, 208, 208)
[TRT] convolution2d_2_output:Conv → (32, 208, 208)
[TRT] batchnormalization_2_output:BatchNormalization → (32, 208, 208)
[TRT] leakyrelu_2_output:LeakyRelu → (32, 208, 208)
[TRT] maxpooling2d_2_output:MaxPool → (32, 104, 104)
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:771: Convolution input dimensions: (32, 104, 104)
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:835: Using kernel: (3, 3), strides: (1, 1), padding: (0, 0), dilations: (1, 1), numOutputs: 64
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:836: Convolution output dimensions: (64, 104, 104)
[TRT] convolution2d_3_output:Conv → (64, 104, 104)
[TRT] batchnormalization_3_output:BatchNormalization → (64, 104, 104)
[TRT] leakyrelu_3_output:LeakyRelu → (64, 104, 104)
[TRT] maxpooling2d_3_output:MaxPool → (64, 52, 52)
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:771: Convolution input dimensions: (64, 52, 52)
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:835: Using kernel: (3, 3), strides: (1, 1), padding: (0, 0), dilations: (1, 1), numOutputs: 128
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:836: Convolution output dimensions: (128, 52, 52)
[TRT] convolution2d_4_output:Conv → (128, 52, 52)
[TRT] batchnormalization_4_output:BatchNormalization → (128, 52, 52)
[TRT] leakyrelu_4_output:LeakyRelu → (128, 52, 52)
[TRT] maxpooling2d_4_output:MaxPool → (128, 26, 26)
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:771: Convolution input dimensions: (128, 26, 26)
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:835: Using kernel: (3, 3), strides: (1, 1), padding: (0, 0), dilations: (1, 1), numOutputs: 256
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:836: Convolution output dimensions: (256, 26, 26)
[TRT] convolution2d_5_output:Conv → (256, 26, 26)
[TRT] batchnormalization_5_output:BatchNormalization → (256, 26, 26)
[TRT] leakyrelu_5_output:LeakyRelu → (256, 26, 26)
[TRT] maxpooling2d_5_output:MaxPool → (256, 13, 13)
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:771: Convolution input dimensions: (256, 13, 13)
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:835: Using kernel: (3, 3), strides: (1, 1), padding: (0, 0), dilations: (1, 1), numOutputs: 512
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:836: Convolution output dimensions: (512, 13, 13)
[TRT] convolution2d_6_output:Conv → (512, 13, 13)
[TRT] batchnormalization_6_output:BatchNormalization → (512, 13, 13)
[TRT] leakyrelu_6_output:LeakyRelu → (512, 13, 13)
[TRT] maxpooling2d_6_output:MaxPool → (512, 13, 13)
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:771: Convolution input dimensions: (512, 13, 13)
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:835: Using kernel: (3, 3), strides: (1, 1), padding: (0, 0), dilations: (1, 1), numOutputs: 1024
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:836: Convolution output dimensions: (1024, 13, 13)
[TRT] convolution2d_7_output:Conv → (1024, 13, 13)
[TRT] batchnormalization_7_output:BatchNormalization → (1024, 13, 13)
[TRT] leakyrelu_7_output:LeakyRelu → (1024, 13, 13)
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:771: Convolution input dimensions: (1024, 13, 13)
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:835: Using kernel: (3, 3), strides: (1, 1), padding: (0, 0), dilations: (1, 1), numOutputs: 1024
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:836: Convolution output dimensions: (1024, 13, 13)
[TRT] convolution2d_8_output:Conv → (1024, 13, 13)
[TRT] batchnormalization_8_output:BatchNormalization → (1024, 13, 13)
[TRT] leakyrelu_8_output:LeakyRelu → (1024, 13, 13)
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:771: Convolution input dimensions: (1024, 13, 13)
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:835: Using kernel: (1, 1), strides: (1, 1), padding: (0, 0), dilations: (1, 1), numOutputs: 125
[TRT] /home/erisuser/p4sw/sw/gpgpu/MachineLearning/DIT/release/5.1/parsers/onnxOpenSource/builtin_op_importers.cpp:836: Convolution output dimensions: (125, 13, 13)
[TRT] grid:Conv → (125, 13, 13)
[TRT] retrieved Input tensor "image": 3x416x416
[TRT] device GPU, configuring CUDA engine
[TRT] device GPU, building FP16: ON
[TRT] device GPU, building INT8: OFF
[TRT] device GPU, building CUDA engine (this may take a few minutes the first time a network is loaded)
[TRT] device GPU, completed building CUDA engine
[TRT] network profiling complete, writing engine cache to /home/***/jetson-inference/data/networks/test/Model.onnx.1.1.GPU.FP16.engine
[TRT] device GPU, completed writing engine cache to /home/***/jetson-inference/data/networks/test/Model.onnx.1.1.GPU.FP16.engine
[TRT] device GPU, /home/***/jetson-inference/data/networks/test/Model.onnx loaded
[TRT] device GPU, CUDA engine context initialized with 2 bindings
[TRT] binding -- index 0
-- name 'image'
-- type FP32
-- in/out INPUT
-- # dims 3
-- dim #0 3 (CHANNEL)
-- dim #1 416 (SPATIAL)
-- dim #2 416 (SPATIAL)
[TRT] binding -- index 1
-- name 'grid'
-- type FP32
-- in/out OUTPUT
-- # dims 3
-- dim #0 125 (CHANNEL)
-- dim #1 13 (SPATIAL)
-- dim #2 13 (SPATIAL)
[TRT] binding to input 0 data binding index: -1
[TRT] binding to input 0 data dims (b=1 c=0 h=0 w=0) size=0
[TRT] failed to alloc CUDA mapped memory for tensor input, 0 bytes
detectNet -- failed to initialize.
jetson.inference -- detectNet failed to load built-in network 'ssd-mobilenet-v2'
PyTensorNet_Dealloc()
Traceback (most recent call last):
File "detectnet-console.py", line 51, in <module>
net = jetson.inference.detectNet(opt.network, sys.argv, opt.threshold)
Exception: jetson.inference -- detectNet failed to load network
jetson.utils -- freeing CUDA mapped memory
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
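One thing I notice in the log: the engine ends up with exactly two bindings, ‘image’ (input) and ‘grid’ (output), while my command passes --input_blob=data and --output_bbox=bboxes, and detectNet defaults the coverage output to ‘coverage’. The “binding index: -1” and the zero-byte allocation look like detectNet simply cannot find a tensor named ‘data’. The binding names can be double-checked by inspecting the cached engine with the TensorRT Python API (a quick sketch, assuming the TensorRT 5.1 Python bindings are installed and that the cache file is a plain serialized engine; the path is taken from the log above):

import tensorrt as trt

ENGINE_PATH = ("/home/***/jetson-inference/data/networks/test/"
               "Model.onnx.1.1.GPU.FP16.engine")

# Deserialize the engine that jetson-inference cached during profiling.
logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)
with open(ENGINE_PATH, "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())

# Print every binding with its direction and shape.
for i in range(engine.num_bindings):
    kind = "INPUT " if engine.binding_is_input(i) else "OUTPUT"
    print(i, kind, engine.get_binding_name(i), engine.get_binding_shape(i))

Per the log above, this should list only ‘image’ and ‘grid’, so there is no ‘data’, ‘coverage’, or ‘bboxes’ tensor for detectNet to bind to. Should I be passing --input_blob=image instead, and is tiny-yolov2’s single ‘grid’ output even compatible with detectNet’s separate coverage/bbox post-processing?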
I flashed the Jetson TX2 with the latest SDK Manager, i.e. “sdkmanager_0.9.14-4964_amd64.deb”.
Jetson: Jetson TX2, Pascal GPU with 256 CUDA cores; 64-bit NVIDIA Denver and ARM Cortex-A57 CPUs; 8 GB LPDDR4 memory; 32 GB eMMC 5.1 flash storage; graphics: NVIDIA Tegra X2 (nvgpu)/integrated
OS: Ubuntu 18.04 LTS, 64-bit
TensorRT version: 5.1.6.1-1+cuda10.0
Python version: Python 3.6.8
Please help me figure out this issue.
Thank you!