Can convert to INT32 but not with FP16

Description

It works fine when converting an int32 ONNX model to an int32 TensorRT engine, but it does not work when trying to convert an fp16 ONNX model to an fp16 TensorRT engine. Mind you, I’m using yolov5, and from the author: “I see the exported model is using Cast modules to FP32 here, probably on grid addition to the outputs. It seems the .half() cast is not affecting the grid/anchor_grid.”
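For context, the fp16 ONNX model was exported roughly as follows (a sketch, not necessarily the exact command used; yolov5’s export.py flags vary between releases, and --half requires exporting on a GPU device):

python export.py --weights yolov5n.pt --include onnx --half --device 0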

Log when doing int32 conversion

[11/21/2022-10:44:19] [TRT] [W] onnx2trt_utils.cpp:369: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[11/21/2022-10:44:19] [TRT] [W] onnx2trt_utils.cpp:395: One or more weights outside the range of INT32 was clamped
[11/21/2022-10:44:19] [TRT] [W] onnx2trt_utils.cpp:395: One or more weights outside the range of INT32 was clamped
[11/21/2022-10:44:19] [TRT] [W] onnx2trt_utils.cpp:395: One or more weights outside the range of INT32 was clamped
tensorrt_conversion.py:110: DeprecationWarning: Use build_serialized_network instead.
engine = builder.build_engine(network, config)
[11/21/2022-10:46:32] [TRT] [W] The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
[11/21/2022-10:46:32] [TRT] [W] The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.

Log when doing fp16 conversion

[11/21/2022-11:09:55] [TRT] [W] onnx2trt_utils.cpp:369: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[11/21/2022-11:09:55] [TRT] [W] onnx2trt_utils.cpp:395: One or more weights outside the range of INT32 was clamped
[11/21/2022-11:09:55] [TRT] [W] onnx2trt_utils.cpp:395: One or more weights outside the range of INT32 was clamped
[11/21/2022-11:09:55] [TRT] [W] onnx2trt_utils.cpp:395: One or more weights outside the range of INT32 was clamped
Segmentation fault

Environment

TensorRT Version = 8.4.3.1
GPU Type = GeForce RTX 3070 Laptop GPU
Nvidia Driver Version = 517.40
CUDA Version =
CUDNN Version = 11.7
Operating System + Version = WSL2 Ubuntu 18.04
Python Version (if applicable) = 3.8.10

Relevant Files

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)  # Builder object
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

success = parser.parse_from_file(onnx_file)
for idx in range(parser.num_errors):
    print(parser.get_error(idx))
if not success:
    raise RuntimeError(f"failed to load ONNX file: {onnx_file}")

config = builder.create_builder_config()  # IBuilderConfig object
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 4 << 30)  # 4 GiB workspace

if builder.platform_has_fast_fp16:
    config.set_flag(trt.BuilderFlag.FP16)

# build_serialized_network() already returns a serialized engine (IHostMemory),
# so a separate serialize() call is not needed
serialized_engine = builder.build_serialized_network(network, config)
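The serialized engine can then be written straight to disk, since build_serialized_network returns a host-memory buffer (the file name below is just illustrative):

with open("yolov5n_fp16.engine", "wb") as f:
    f.write(serialized_engine)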

Hi,
Request you to share the ONNX model and the script, if not shared already, so that we can assist you better.
Alongside, you can try a few things:

  1. Validating your model with the below snippet

check_model.py

import sys
import onnx
filename = "yourONNXmodel"  # replace with the path to your ONNX file
model = onnx.load(filename)
onnx.checker.check_model(model)
  2. Try running your model with the trtexec command.

In case you are still facing the issue, request you to share the trtexec --verbose log for further debugging.
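For example, a minimal invocation could look like this (the model path is assumed to be in the current directory; --fp16 exercises the half-precision build):

trtexec --onnx=yolov5n.onnx --fp16 --verbose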
Thanks!

Hi,
I did this
import sys
import onnx
filename = "yolov5n.onnx"
model = onnx.load(filename)
try:
    onnx.checker.check_model(model)
except onnx.checker.ValidationError as e:
    print(f"The model is invalid: {e}")
else:
    print("The model is valid!")
and the model is valid. If it weren’t, it presumably couldn’t have been converted to int32 either.
Could it be that fp16 conversion is much more demanding on RAM, so that WSL2 doesn’t have access to all the memory it needs?

$ free -h
               total        used        free      shared  buff/cache   available
Mem:            24Gi       1.0Gi        22Gi       0.0Ki       1.8Gi        23Gi
Swap:           7.0Gi          0B       7.0Gi
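If the WSL2 memory cap is the suspect, it can be raised explicitly. A sketch, assuming the standard .wslconfig mechanism (the file lives at %UserProfile%\.wslconfig on the Windows side; the values are illustrative, and wsl --shutdown is needed before they take effect):

[wsl2]
memory=20GB
swap=8GB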

Uploaded both the onnx and int32 engine model.
yolov5n.onnx (4.0 MB)
yolov5n.engine (10.6 MB)

Hi,

We could successfully build the TRT engine using the following command.

trtexec --onnx=yolov5n.onnx --verbose --fp16 --memPoolSize=workspace:10000

We recommend you try the latest TensorRT 8.5.1 version.
If you still face this issue, please share with us the complete verbose logs.

Thank you.