Description
I converted a very complex ONNX model to a TensorRT FP32 engine, and the outputs of the ONNX model and the TRT FP32 engine match, so everything is fine at FP32.
However, when I convert the same ONNX model to a TensorRT FP16 engine with the following command, the outputs are very different and some weights are reported as affected:
trtexec --onnx=encoder2.onnx --fp16 --saveEngine=encoderfp16.trt --useCudaGraph --verbose --tacticSources=-cublasLt,+cublas --workspace=10240M --minShapes=src_tokens:1x1000 --optShapes=src_tokens:1x100000 --maxShapes=src_tokens:1x700000 --preview=+fasterDynamicShapes0805 >log.en
[01/13/2023-10:07:54] [W] [TRT] onnx2trt_utils.cpp:377: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[01/13/2023-10:09:43] [W] [TRT] Using kFASTER_DYNAMIC_SHAPES_0805 preview feature.
[01/13/2023-10:20:45] [W] [TRT] TensorRT encountered issues when converting weights between types and that could affect accuracy.
[01/13/2023-10:20:45] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to adjust the magnitude of the weights.
[01/13/2023-10:20:45] [W] [TRT] Check verbose logs for the list of affected weights.
[01/13/2023-10:20:45] [W] [TRT] - 254 weights are affected by this issue: Detected subnormal FP16 values.
[01/13/2023-10:20:45] [W] [TRT] - 31 weights are affected by this issue: Detected values less than smallest positive FP16 subnormal value and converted them to the FP16 minimum subnormalized value.
The detailed log is attached:
log.en (33.7 MB)
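To see which weights actually fall outside the FP16 range before building, I scan the ONNX initializers with a small script (a rough sketch; 2^-24 ≈ 5.96e-8 is the smallest positive FP16 subnormal and 2^-14 ≈ 6.1e-5 the smallest positive normal FP16 value):

```python
import numpy as np
import onnx
from onnx import numpy_helper

FP16_MIN_SUBNORMAL = 2.0 ** -24  # smallest positive FP16 subnormal
FP16_MIN_NORMAL = 2.0 ** -14     # smallest positive normal FP16 value

model = onnx.load("encoder2.onnx")
for init in model.graph.initializer:
    w = numpy_helper.to_array(init).astype(np.float64).ravel()
    nz = np.abs(w[w != 0])
    if nz.size == 0:
        continue
    n_subnormal = int(((nz < FP16_MIN_NORMAL) & (nz >= FP16_MIN_SUBNORMAL)).sum())
    n_underflow = int((nz < FP16_MIN_SUBNORMAL).sum())
    if n_subnormal or n_underflow:
        print(f"{init.name}: {n_subnormal} subnormal in FP16, "
              f"{n_underflow} below the FP16 subnormal minimum")
```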
I want to use --precisionConstraints and --layerPrecisions to force some layers to FP32. The command is as follows:
trtexec --onnx=encoder2.onnx --fp16 --saveEngine=encoderfp16.trt --useCudaGraph --verbose --tacticSources=-cublasLt,+cublas --workspace=10240M --minShapes=src_tokens:1x1000 --optShapes=src_tokens:1x100000 --maxShapes=src_tokens:1x700000 --preview=+fasterDynamicShapes0805 --precisionConstraints=obey --layerPrecisions=Conv_240.weight:fp32,Conv_263.weight:fp32,Conv_286.weight:fp32 >log.en1
But the log output is the same:
[01/13/2023-10:42:38] [W] [TRT] onnx2trt_utils.cpp:377: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[01/13/2023-10:44:28] [W] [TRT] Using kFASTER_DYNAMIC_SHAPES_0805 preview feature.
[01/13/2023-10:55:24] [W] [TRT] TensorRT encountered issues when converting weights between types and that could affect accuracy.
[01/13/2023-10:55:24] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to adjust the magnitude of the weights.
[01/13/2023-10:55:24] [W] [TRT] Check verbose logs for the list of affected weights.
[01/13/2023-10:55:24] [W] [TRT] - 254 weights are affected by this issue: Detected subnormal FP16 values.
[01/13/2023-10:55:24] [W] [TRT] - 31 weights are affected by this issue: Detected values less than smallest positive FP16 subnormal value and converted them to the FP16 minimum subnormalized value.
[01/13/2023-10:55:26] [W] [TRT] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See CUDA_MODULE_LOADING in CUDA C++ Programming Guide
The detailed log is attached:
log.en1 (33.9 MB)
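If the trtexec flags cannot address this, I plan to try pinning the layers through the TensorRT Python builder API instead. A minimal sketch of what I have in mind is below; the names in FP32_LAYERS are guesses taken from the ONNX node names, and the real TensorRT layer names (which may differ after fusion) would have to be taken from the verbose build log:

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)

# Layers to keep in FP32 -- assumed names based on the ONNX nodes.
FP32_LAYERS = {"Conv_240", "Conv_263", "Conv_286"}

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, TRT_LOGGER)
with open("encoder2.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise SystemExit(1)

config = builder.create_builder_config()
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 10240 << 20)
config.set_flag(trt.BuilderFlag.FP16)
# Without this flag, per-layer precisions are treated as hints only.
config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)

# Dynamic-shape profile matching the trtexec min/opt/max shapes.
profile = builder.create_optimization_profile()
profile.set_shape("src_tokens", (1, 1000), (1, 100000), (1, 700000))
config.add_optimization_profile(profile)

# Force the selected layers (and their outputs) to FP32.
for i in range(network.num_layers):
    layer = network.get_layer(i)
    if layer.name in FP32_LAYERS:
        layer.precision = trt.float32
        for j in range(layer.num_outputs):
            layer.set_output_type(j, trt.float32)

engine_bytes = builder.build_serialized_network(network, config)
with open("encoderfp16_mixed.trt", "wb") as f:
    f.write(engine_bytes)
```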
Environment
TensorRT Version: TensorRT-8.5.2.2.Linux.x86_64-gnu.cuda-11.8.cudnn8.6.tar.gz
GPU Type: Tesla V100-PCIE
CUDA Version: 11.6
CUDNN Version: cudnn-linux-x86_64-8.6.0.163_cuda11-archive
PyTorch Version (if applicable): 1.12.0
ONNX Version: 1.12.0
Relevant Files
encoder2.onnx and the full build logs log.en (33.7 MB) and log.en1 (33.9 MB) are attached above.
Steps To Reproduce
1. Run the first trtexec command above to build the FP16 engine; the weight-conversion warnings appear in log.en.
2. Run the second trtexec command with --precisionConstraints=obey and --layerPrecisions; the warnings in log.en1 are unchanged.
3. Compare the FP16 engine outputs against the ONNX model outputs: they differ significantly, while the FP32 engine matches (see the comparison sketch below).
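For reference, this is roughly how I compare ONNX Runtime against the TensorRT engine (a sketch assuming a single input named src_tokens and a single output; the random tokens stand in for real tokenizer output, and the input cast reflects the INT64-to-INT32 warning above):

```python
import numpy as np
import onnxruntime as ort
import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
import pycuda.driver as cuda
import tensorrt as trt

# Stand-in input; real src_tokens come from the tokenizer.
tokens = np.random.randint(0, 1000, size=(1, 100000), dtype=np.int64)

# Reference outputs from ONNX Runtime.
sess = ort.InferenceSession("encoder2.onnx",
                            providers=["CPUExecutionProvider"])
ref = sess.run(None, {"src_tokens": tokens})[0]

# TensorRT engine built by trtexec above.
logger = trt.Logger(trt.Logger.WARNING)
with open("encoderfp16.trt", "rb") as f:
    engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

inp = engine.get_binding_index("src_tokens")
out = 1 - inp  # assumes exactly one input and one output
trt_tokens = tokens.astype(trt.nptype(engine.get_binding_dtype(inp)))
context.set_binding_shape(inp, trt_tokens.shape)

result = np.empty(tuple(context.get_binding_shape(out)),
                  dtype=trt.nptype(engine.get_binding_dtype(out)))
d_in = cuda.mem_alloc(trt_tokens.nbytes)
d_out = cuda.mem_alloc(result.nbytes)
cuda.memcpy_htod(d_in, trt_tokens)
bindings = [0, 0]
bindings[inp] = int(d_in)
bindings[out] = int(d_out)
context.execute_v2(bindings)
cuda.memcpy_dtoh(result, d_out)

print("max abs diff:", np.abs(result.astype(np.float64) - ref).max())
```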