Failed to convert YOLOv4 model trained in TAO to a TensorRT engine

Please provide the following information when requesting support.

• Hardware (T4/V100/Xavier/Nano/etc) RTX 3080
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc) YOLOV4
I have a model trained with YOLOv4. The spec file is as follows.
experiment_spec.txt (2.1 KB)

The YOLOv4 model was trained on customized data.
The model is exported to ONNX using the following command.

yolo_v4 export \
  -m yolov4/models/yolov4_resnet18_epoch_280.tlt \
  -k yolov4 \
  -o yolov4/models/yolov4_resnet18_epoch_280.onnx \
  -e yolov4/specs/experiment_spec.txt

After that, the ONNX file is converted to a TensorRT engine using the following two commands.

Using trtexec (DeepStream)
trtexec --onnx=./models/yolov4_new/yolov4_resnet18_epoch_280.onnx \
  --int8 \
  --calib=./models/yolov4/cal_trt861.bin \
  --saveEngine=./models/yolov4_new/1/yolov4_resnet18_epoch_280.onnx_b4_gpu0_int8.engine \
  --minShapes=Input:1x3x544x960 \
  --optShapes=Input:2x3x544x960 \
  --maxShapes=Input:4x3x544x960
Errors:

[07/02/2024-03:05:54] [I] [TRT] [MemUsageChange] Init CUDA: CPU +2, GPU +0, now: CPU 22, GPU 626 (MiB)
[07/02/2024-03:05:59] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +1444, GPU +268, now: CPU 1542, GPU 886 (MiB)
[07/02/2024-03:05:59] [I] Start parsing network model.
[07/02/2024-03:05:59] [I] [TRT] ----------------------------------------------------------------
[07/02/2024-03:05:59] [I] [TRT] Input filename:   ./models/yolov4_new/yolov4_resnet18_epoch_280.etlt
[07/02/2024-03:05:59] [I] [TRT] ONNX IR version:  0.0.0
[07/02/2024-03:05:59] [I] [TRT] Opset version:    0
[07/02/2024-03:05:59] [I] [TRT] Producer name:    
[07/02/2024-03:05:59] [I] [TRT] Producer version: 
[07/02/2024-03:05:59] [I] [TRT] Domain:           
[07/02/2024-03:05:59] [I] [TRT] Model version:    0
[07/02/2024-03:05:59] [I] [TRT] Doc string:       
[07/02/2024-03:05:59] [I] [TRT] ----------------------------------------------------------------
[07/02/2024-03:05:59] [I] Finished parsing network model. Parse time: 0.0786991
[07/02/2024-03:05:59] [E] Cannot find input tensor with name "Input" in the network inputs! Please make sure the input tensor names are correct.
[07/02/2024-03:05:59] [E] Network And Config setup failed
[07/02/2024-03:05:59] [E] Building engine failed
[07/02/2024-03:05:59] [E] Failed to create engine from model or file.
[07/02/2024-03:05:59] [E] Engine set up failed

Using the nvcr.io/nvidia/tao/tao-toolkit:5.3.0-deploy Docker image
yolo_v4 gen_trt_engine \
  -m /workspace/yolov4/models/yolov4_resnet18_epoch_280.onnx \
  -e /workspace/yolov4/specs/experiment_spec.txt \
  -r /workspace/yolov4/export/ \
  -k yolov4 \
  --engine_file /workspace/yolov4/export/int8.engine
Errors:

2024-07-02 06:19:11,393 [TAO Toolkit] [INFO] nvidia_tao_deploy.cv.common.logging.status_logging 198: Log file already exists at /workspace/yolov4/export/status.json
2024-07-02 06:19:11,394 [TAO Toolkit] [INFO] root 174: Starting yolo_v4 gen_trt_engine.
[07/02/2024-06:19:11] [TRT] [I] [MemUsageChange] Init CUDA: CPU +1, GPU +0, now: CPU 34, GPU 821 (MiB)
[07/02/2024-06:19:16] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +1444, GPU +268, now: CPU 1555, GPU 1089 (MiB)
2024-07-02 06:19:16,798 [TAO Toolkit] [INFO] nvidia_tao_deploy.cv.yolo_v3.engine_builder 79: Parsing ONNX model
/usr/local/lib/python3.10/dist-packages/onnx/serialization.py:118: RuntimeWarning: Unexpected end-group tag: Not all data was converted
  decoded = typing.cast(Optional[int], proto.ParseFromString(serialized))
2024-07-02 06:19:16,987 [TAO Toolkit] [INFO] root 174: Protobuf decoding consumed too few bytes: 1 out of 139881234
Traceback (most recent call last):
  File "</usr/local/lib/python3.10/dist-packages/nvidia_tao_deploy/cv/yolo_v4/scripts/gen_trt_engine.py>", line 3, in <module>
  File "<frozen cv.yolo_v4.scripts.gen_trt_engine>", line -1, in <module>
  File "<frozen cv.common.decorators>", line 63, in _func
  File "<frozen cv.common.decorators>", line 47, in _func
  File "<frozen cv.yolo_v4.scripts.gen_trt_engine>", line 69, in main
  File "<frozen cv.yolo_v3.engine_builder>", line 79, in create_network
  File "<frozen cv.yolo_v3.engine_builder>", line 61, in get_onnx_input_dims
  File "/usr/local/lib/python3.10/dist-packages/onnx/__init__.py", line 208, in load_model
    model = _get_serializer(format, f).deserialize_proto(_load_bytes(f), ModelProto())
  File "/usr/local/lib/python3.10/dist-packages/onnx/serialization.py", line 120, in deserialize_proto
    raise google.protobuf.message.DecodeError(
google.protobuf.message.DecodeError: Protobuf decoding consumed too few bytes: 1 out of 139881234
2024-07-02 06:19:17,200 [TAO Toolkit] [INFO] nvidia_tao_deploy.cv.common.entrypoint.entrypoint_proto: Sending telemetry data.
2024-07-02 06:19:20,184 [TAO Toolkit] [INFO] nvidia_tao_deploy.cv.common.entrypoint.entrypoint_proto: Execution status: FAIL

The ONNX model is attached here.

I cannot open your onnx file with Netron. Could you open it successfully?

I also can’t open it. What could be wrong with it? It was trained with YOLOv4.

Can you double-check that the .tlt file works? For example, run evaluation with it.
Then export to ONNX again, and please double-check that the key is correct.
Also, in the latest TAO the training result is an .hdf5 file, so I am afraid you are using an old version of TAO. If that is the case, please follow the old TAO guide:
NVIDIA TAO - NVIDIA Docs
For example, YOLOv4 - NVIDIA Docs.
I am afraid your exported model is actually an .etlt file.

Thanks, it worked.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.