Can convert to INT32 but not with FP16

Description

It works fine when converting an int32 ONNX model to an int32 TensorRT engine, but it does not work when trying to convert an fp16 ONNX model to an fp16 TensorRT engine. Mind you, I’m using yolov5, and from the author: “I see the exported model is using Cast modules to FP32 here, probably on grid addition to the outputs. It seems the .half() cast is not affecting the grid/anchor_grid.”
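For context, the fp16 ONNX model was exported roughly as follows (a sketch, not necessarily the exact command used; yolov5’s export.py flags vary between releases, and --half requires exporting on a GPU device):

python export.py --weights yolov5n.pt --include onnx --half --device 0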

Log when doing int32 conversion

[11/21/2022-10:44:19] [TRT] [W] onnx2trt_utils.cpp:369: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[11/21/2022-10:44:19] [TRT] [W] onnx2trt_utils.cpp:395: One or more weights outside the range of INT32 was clamped
[11/21/2022-10:44:19] [TRT] [W] onnx2trt_utils.cpp:395: One or more weights outside the range of INT32 was clamped
[11/21/2022-10:44:19] [TRT] [W] onnx2trt_utils.cpp:395: One or more weights outside the range of INT32 was clamped
tensorrt_conversion.py:110: DeprecationWarning: Use build_serialized_network instead.
engine = builder.build_engine(network, config)
[11/21/2022-10:46:32] [TRT] [W] The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
[11/21/2022-10:46:32] [TRT] [W] The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.

Log when doing fp16 conversion

[11/21/2022-11:09:55] [TRT] [W] onnx2trt_utils.cpp:369: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[11/21/2022-11:09:55] [TRT] [W] onnx2trt_utils.cpp:395: One or more weights outside the range of INT32 was clamped
[11/21/2022-11:09:55] [TRT] [W] onnx2trt_utils.cpp:395: One or more weights outside the range of INT32 was clamped
[11/21/2022-11:09:55] [TRT] [W] onnx2trt_utils.cpp:395: One or more weights outside the range of INT32 was clamped
Segmentation fault

Environment

TensorRT Version = 8.4.3.1
GPU Type = GeForce RTX 3070 Laptop GPU
Nvidia Driver Version = 517.40
CUDA Version =
CUDNN Version = 11.7
Operating System + Version = WSL2 Ubuntu 18.04
Python Version (if applicable) = 3.8.10

Relevant Files

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)  # Builder object
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

success = parser.parse_from_file(onnx_file)
for idx in range(parser.num_errors):
    print(parser.get_error(idx))
if not success:
    raise RuntimeError(f"failed to load ONNX file: {onnx_file}")

config = builder.create_builder_config()  # IBuilderConfig object
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 4 << 30)  # 4 GiB workspace

if builder.platform_has_fast_fp16:
    config.set_flag(trt.BuilderFlag.FP16)

# build_serialized_network() already returns a serialized engine (IHostMemory),
# so a separate serialize() call is not needed
serialized_engine = builder.build_serialized_network(network, config)
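The serialized engine can then be written straight to disk, since build_serialized_network returns a host-memory buffer (the file name below is just illustrative):

with open("yolov5n_fp16.engine", "wb") as f:
    f.write(serialized_engine)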

Hi,
Request you to share the ONNX model and the script, if not shared already, so that we can assist you better.
Alongside, you can try a few things:

  1. Validating your model with the below snippet

check_model.py

import sys
import onnx
filename = "yourONNXmodel"  # replace with the path to your ONNX file
model = onnx.load(filename)
onnx.checker.check_model(model)
  2. Try running your model with the trtexec command.

In case you are still facing the issue, request you to share the trtexec --verbose log for further debugging.
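For example, a minimal invocation could look like this (the model path is assumed to be in the current directory; --fp16 exercises the half-precision build):

trtexec --onnx=yolov5n.onnx --fp16 --verbose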
Thanks!

Hi,
I did this
import sys
import onnx
filename = "yolov5n.onnx"
model = onnx.load(filename)
try:
    onnx.checker.check_model(model)
except onnx.checker.ValidationError as e:
    print(f"The model is invalid: {e}")
else:
    print("The model is valid!")
and the model is valid. If it weren’t, it presumably couldn’t have been converted to int32 either.
Could it be that fp16 conversion is much more demanding on RAM, so that WSL2 doesn’t have access to all the memory it needs?

$ free -h
               total        used        free      shared  buff/cache   available
Mem:            24Gi       1.0Gi        22Gi       0.0Ki       1.8Gi        23Gi
Swap:           7.0Gi          0B       7.0Gi
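If the WSL2 memory cap is the suspect, it can be raised explicitly. A sketch, assuming the standard .wslconfig mechanism (the file lives at %UserProfile%\.wslconfig on the Windows side; the values are illustrative, and wsl --shutdown is needed before they take effect):

[wsl2]
memory=20GB
swap=8GB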

Uploaded both the onnx and int32 engine model.
yolov5n.onnx (4.0 MB)
yolov5n.engine (10.6 MB)

Hi,

We could successfully build the TRT engine using the following command.

trtexec --onnx=yolov5n.onnx --verbose --fp16 --memPoolSize=workspace:10000

We recommend you try the latest TensorRT 8.5.1 version.
If you still face this issue, please share with us the complete verbose logs.

Thank you.