Hi,
I retrained the detection model (ssd-mobilenet-v1) on my own dataset, successfully converted it to ONNX, and am now trying to run it on a Jetson Nano 4GB kit.
I used this command:
detectnet --model=models/Cdu/ssd-mobilenet.onnx --labels=models/Cdu/labels.txt --input-blog=input_0 --output-cvg=scores --output-bbox=boxes "data/cdu_640_480.mp4" data/test.mp4
[gstreamer] initialized gstreamer, version 1.14.5.0
[gstreamer] gstDecoder -- creating decoder for data/cdu_640_480.mp4
Opening in BLOCKING MODE
[gstreamer] gstDecoder -- discovered video resolution: 640x480 (framerate 29.750000 Hz)
[gstreamer] gstDecoder -- discovered video caps: video/x-h264, stream-format=(string)byte-stream, alignment=(string)au, level=(string)3, profile=(string)high, width=(int)640, height=(int)480, framerate=(fraction)119/4, pixel-aspect-ratio=(fraction)1/1, interlace-mode=(string)progressive, chroma-format=(string)4:2:0, bit-depth-luma=(uint)8, bit-depth-chroma=(uint)8, parsed=(boolean)true
[gstreamer] gstDecoder -- pipeline string:
[gstreamer] filesrc location=data/cdu_640_480.mp4 ! qtdemux ! queue ! h264parse ! omxh264dec ! video/x-raw(memory:NVMM) ! appsink name=mysink
[video] created gstDecoder from file:///home/jetson/jetson-inference/python/training/detection/ssd/data/cdu_640_480.mp4
------------------------------------------------
gstDecoder video options:
-- URI: file:///home/jetson/jetson-inference/python/training/detection/ssd/data/cdu_640_480.mp4
- protocol: file
- location: data/cdu_640_480.mp4
- extension: mp4
-- deviceType: file
-- ioType: input
-- codec: h264
-- width: 640
-- height: 480
-- frameRate: 29.750000
-- bitRate: 0
-- numBuffers: 4
-- zeroCopy: true
-- flipMethod: none
-- loop: 0
-- rtspLatency 2000
[gstreamer] gstEncoder -- codec not specified, defaulting to H.264
[gstreamer] gstEncoder -- pipeline launch string:
[gstreamer] appsrc name=mysource is-live=true do-timestamp=true format=3 ! omxh264enc bitrate=4000000 ! video/x-h264 ! h264parse ! qtmux ! filesink location=data/test.mp4
[video] created gstEncoder from file:///home/jetson/jetson-inference/python/training/detection/ssd/data/test.mp4
------------------------------------------------
gstEncoder video options:
-- URI: file:///home/jetson/jetson-inference/python/training/detection/ssd/data/test.mp4
- protocol: file
- location: data/test.mp4
- extension: mp4
-- deviceType: file
-- ioType: output
-- codec: h264
-- width: 0
-- height: 0
-- frameRate: 30.000000
-- bitRate: 4000000
-- numBuffers: 4
-- zeroCopy: true
-- flipMethod: none
-- loop: 0
-- rtspLatency 2000
[OpenGL] glDisplay – X screen 0 resolution: 1920x1080
[OpenGL] glDisplay – X window resolution: 1920x1080
[OpenGL] glDisplay – display device initialized (1920x1080)
[video] created glDisplay from display://0
------------------------------------------------
glDisplay video options:
-- URI: display://0
- protocol: display
- location: 0
-- deviceType: display
-- ioType: output
-- codec: raw
-- width: 1920
-- height: 1080
-- frameRate: 0.000000
-- bitRate: 0
-- numBuffers: 4
-- zeroCopy: true
-- flipMethod: none
-- loop: 0
-- rtspLatency 2000
detectNet -- loading detection network model from:
-- prototxt NULL
-- model models/Cdu/ssd-mobilenet.onnx
-- input_blob 'data'
-- output_cvg 'scores'
-- output_bbox 'boxes'
-- mean_pixel 0.000000
-- class_labels models/Cdu/labels.txt
-- class_colors NULL
-- threshold 0.500000
-- batch_size 1
[TRT] TensorRT version 8.0.1
[TRT] loading NVIDIA plugins...
[TRT] Registered plugin creator - ::GridAnchor_TRT version 1
[TRT] Registered plugin creator - ::GridAnchorRect_TRT version 1
[TRT] Registered plugin creator - ::NMS_TRT version 1
[TRT] Registered plugin creator - ::Reorg_TRT version 1
[TRT] Registered plugin creator - ::Region_TRT version 1
[TRT] Registered plugin creator - ::Clip_TRT version 1
[TRT] Registered plugin creator - ::LReLU_TRT version 1
[TRT] Registered plugin creator - ::PriorBox_TRT version 1
[TRT] Registered plugin creator - ::Normalize_TRT version 1
[TRT] Registered plugin creator - ::ScatterND version 1
[TRT] Registered plugin creator - ::RPROI_TRT version 1
[TRT] Registered plugin creator - ::BatchedNMS_TRT version 1
[TRT] Registered plugin creator - ::BatchedNMSDynamic_TRT version 1
[TRT] Could not register plugin creator - ::FlattenConcat_TRT version 1
[TRT] Registered plugin creator - ::CropAndResize version 1
[TRT] Registered plugin creator - ::DetectionLayer_TRT version 1
[TRT] Registered plugin creator - ::EfficientNMS_ONNX_TRT version 1
[TRT] Registered plugin creator - ::EfficientNMS_TRT version 1
[TRT] Registered plugin creator - ::Proposal version 1
[TRT] Registered plugin creator - ::ProposalLayer_TRT version 1
[TRT] Registered plugin creator - ::PyramidROIAlign_TRT version 1
[TRT] Registered plugin creator - ::ResizeNearest_TRT version 1
[TRT] Registered plugin creator - ::Split version 1
[TRT] Registered plugin creator - ::SpecialSlice_TRT version 1
[TRT] Registered plugin creator - ::InstanceNormalization_TRT version 1
[TRT] detected model format - ONNX (extension '.onnx')
[TRT] desired precision specified for GPU: FASTEST
[TRT] requested fasted precision for device GPU without providing valid calibrator, disabling INT8
[TRT] [MemUsageChange] Init CUDA: CPU +197, GPU +0, now: CPU 230, GPU 2693 (MiB)
[TRT] native precisions detected for GPU: FP32, FP16
[TRT] selecting fastest native precision for GPU: FP16
[TRT] found engine cache file models/Cdu/ssd-mobilenet.onnx.1.1.8001.GPU.FP16.engine
[TRT] found model checksum models/Cdu/ssd-mobilenet.onnx.sha256sum
[TRT] echo “$(cat models/Cdu/ssd-mobilenet.onnx.sha256sum) models/Cdu/ssd-mobilenet.onnx” | sha256sum --check --status
[TRT] model matched checksum models/Cdu/ssd-mobilenet.onnx.sha256sum
[TRT] loading network plan from engine cache... models/Cdu/ssd-mobilenet.onnx.1.1.8001.GPU.FP16.engine
[TRT] device GPU, loaded models/Cdu/ssd-mobilenet.onnx
[TRT] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 252, GPU 2716 (MiB)
[TRT] Loaded engine size: 21 MB
[TRT] [MemUsageSnapshot] deserializeCudaEngine begin: CPU 252 MiB, GPU 2716 MiB
[TRT] Using cublas a tactic source
[TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +158, GPU +158, now: CPU 411, GPU 2874 (MiB)
[TRT] Using cuDNN as a tactic source
[TRT] [MemUsageChange] Init cuDNN: CPU +241, GPU +242, now: CPU 652, GPU 3116 (MiB)
[TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 651, GPU 3116 (MiB)
[TRT] Deserialization required 3515873 microseconds.
[TRT] [MemUsageSnapshot] deserializeCudaEngine end: CPU 651 MiB, GPU 3116 MiB
[TRT] [MemUsageSnapshot] ExecutionContext creation begin: CPU 651 MiB, GPU 3116 MiB
[TRT] Using cublas a tactic source
[TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 651, GPU 3116 (MiB)
[TRT] Using cuDNN as a tactic source
[TRT] [MemUsageChange] Init cuDNN: CPU +1, GPU +0, now: CPU 652, GPU 3116 (MiB)
[TRT] Total per-runner device memory is 14915584
[TRT] Total per-runner host memory is 73456
[TRT] Allocated activation device memory of size 9528832
[TRT] [MemUsageSnapshot] ExecutionContext creation end: CPU 654 MiB, GPU 3119 MiB
[TRT]
[TRT] CUDA engine context initialized on device GPU:
[TRT] -- layers 103
[TRT] -- maxBatchSize 1
[TRT] -- deviceMemory 9528832
[TRT] -- bindings 3
[TRT] binding 0
-- index 0
-- name 'input_0'
-- type FP32
-- in/out INPUT
-- # dims 4
-- dim #0 1
-- dim #1 3
-- dim #2 300
-- dim #3 300
[TRT] binding 1
-- index 1
-- name 'scores'
-- type FP32
-- in/out OUTPUT
-- # dims 3
-- dim #0 1
-- dim #1 3000
-- dim #2 22
[TRT] binding 2
-- index 2
-- name 'boxes'
-- type FP32
-- in/out OUTPUT
-- # dims 3
-- dim #0 1
-- dim #1 3000
-- dim #2 4
[TRT]
[TRT] 3: Cannot find binding of given name: data
[TRT] failed to find requested input layer data in network
[TRT] device GPU, failed to create resources for CUDA engine
[TRT] failed to create TensorRT engine for models/Cdu/ssd-mobilenet.onnx, device GPU
[TRT] detectNet -- failed to initialize.
detectnet: failed to load detectNet model
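For reference, the engine dump above lists the bindings as 'input_0', 'scores', and 'boxes', while the final error says the loader searched for an input binding named 'data' (detectNet's default input name). A minimal stdlib-Python sketch of that lookup, using the names from the log (the `find_binding` helper is illustrative only, not jetson-inference or TensorRT code):

```python
# Illustrative sketch of the binding-name lookup that fails at the end of the
# log. Binding names are taken from the engine dump above; find_binding() is
# hypothetical and not part of jetson-inference.

def find_binding(bindings, requested):
    """Return the requested binding name if the engine has it; otherwise
    raise, listing the names that actually exist."""
    if requested in bindings:
        return requested
    raise KeyError(
        f"Cannot find binding of given name: {requested} "
        f"(available bindings: {', '.join(bindings)})"
    )

engine_bindings = ["input_0", "scores", "boxes"]  # from the engine dump

find_binding(engine_bindings, "input_0")  # found in the engine
# find_binding(engine_bindings, "data")   # raises KeyError, matching the log
```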