Description
While converting the VITS ONNX model (vits/models.py at 2e561ba58618d021b5b8323d3765880f7e0ecfdb · jaywalnut310/vits · GitHub) to a TensorRT engine end to end, I hit the following error:
[04/15/2025-06:31:23] [TRT] [V] =============== Computing costs for /_enc/flow/flows.6/enc/ffn_layers.0/conv_1/Conv
[04/15/2025-06:31:23] [TRT] [V] *************** Autotuning format combination: Float((MUL_ADD 192 E1 768),E0,E0,1) where E0=(+ E1 4) E1=(MIN 9223372036854775807 (MAX 1 (# 0 (VALUE /_enc/ReduceSum_output_0)))) -> Float((* 768 E0),E0,E0,1) where E0=(MIN 9223372036854775807 (MAX 1 (# 0 (VALUE /_enc/ReduceSum_output_0)))) where E0=(+ E1 4) E1=(MIN 9223372036854775807 (MAX 1 (# 0 (VALUE /_enc/ReduceSum_output_0)))) ***************
[04/15/2025-06:31:23] [TRT] [E] [convBaseBuilder.cpp::createConvolution::200] Error Code 2: Internal Error (Assertion isOpConsistent(convolution.get()) failed. Cask convolution isConsistent check failed.)
I tried to track down the bug following a similar issue (Assertion bound >= 0 failed of TensorRT 8.6.1 when running build_serialized_network on GPU nvidia tesla v100 · Issue #3639 · NVIDIA/TensorRT · GitHub) and ran `polygraphy run <model.onnx> --onnxrt` to validate the ONNX file, but found no clue. The polygraphy output:
[I] RUNNING | Command: /dh-nas-dev/tujiexu/miniconda3/envs/bert-vits2/bin/polygraphy run /dh-nas-dev/tujiexu/workplace/code/online/deployment/ckpts/bert_vitst/dianx_model_male_00_01_250308/G_model_mix_nolength.onnx --onnxrt
[I] onnxrt-runner-N0-04/15/25-06:28:54 | Activating and starting inference
[I] Creating ONNX-Runtime Inference Session with providers: ['CPUExecutionProvider']
[W] Input tensor: x [shape=BoundedShape([1, 'text_length'], min=None, max=None)] | Will generate data of shape: [1, 1].
If this is incorrect, please provide a custom data loader.
[W] Input tensor: tone [shape=BoundedShape([1, 'text_length'], min=None, max=None)] | Will generate data of shape: [1, 1].
If this is incorrect, please provide a custom data loader.
[W] Input tensor: language [shape=BoundedShape([1, 'text_length'], min=None, max=None)] | Will generate data of shape: [1, 1].
If this is incorrect, please provide a custom data loader.
[W] Input tensor: bert [shape=BoundedShape([1, 1024, 'text_length'], min=None, max=None)] | Will generate data of shape: [1, 1024, 1].
If this is incorrect, please provide a custom data loader.
[W] Input tensor: ja_bert [shape=BoundedShape([1, 1024, 'text_length'], min=None, max=None)] | Will generate data of shape: [1, 1024, 1].
If this is incorrect, please provide a custom data loader.
[W] Input tensor: en_bert [shape=BoundedShape([1, 1024, 'text_length'], min=None, max=None)] | Will generate data of shape: [1, 1024, 1].
If this is incorrect, please provide a custom data loader.
[I] onnxrt-runner-N0-04/15/25-06:28:54
---- Inference Input(s) ----
{x [dtype=int64, shape=(1, 1)],
sid [dtype=int64, shape=(1,)],
tone [dtype=int64, shape=(1, 1)],
language [dtype=int64, shape=(1, 1)],
bert [dtype=float32, shape=(1, 1024, 1)],
ja_bert [dtype=float32, shape=(1, 1024, 1)],
en_bert [dtype=float32, shape=(1, 1024, 1)]}
[I] onnxrt-runner-N0-04/15/25-06:28:54
---- Inference Output(s) ----
{o [dtype=float16, shape=(1, 1, 3584)],
z [dtype=float32, shape=(1, 192, 7)],
attn [dtype=float32, shape=(1, 1, 7, 1)],
duration [dtype=int64, shape=(1,)]}
[I] onnxrt-runner-N0-04/15/25-06:28:54 | Completed 1 iteration(s) in 52.15 ms | Average inference time: 52.15 ms.
[I] PASSED | Runtime: 7.960s | Command: /dh-nas-dev/tujiexu/miniconda3/envs/bert-vits2/bin/polygraphy run /dh-nas-dev/tujiexu/workplace/code/online/deployment/ckpts/bert_vitst/dianx_model_male_00_01_250308/G_model_mix_nolength.onnx --onnxrt
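For reference, here is my reading of the symbolic shape expressions (E0, E1) in the verbose autotuner line above, as a plain-Python sketch. This is not TensorRT code; the example length of 7 is an assumption taken from the z output shape (1, 192, 7) in the polygraphy run above.

```python
# Minimal sketch decoding the autotuner's symbolic stride expressions.
# Not TensorRT code -- just my interpretation of the verbose log line.

INT64_MAX = 9223372036854775807

def decode(reduce_sum_dim0: int):
    # E1 = (MIN 9223372036854775807 (MAX 1 (# 0 (VALUE /_enc/ReduceSum_output_0))))
    e1 = min(INT64_MAX, max(1, reduce_sum_dim0))
    # E0 = (+ E1 4)
    e0 = e1 + 4
    # Input format:  Float((MUL_ADD 192 E1 768), E0, E0, 1)
    in_strides = (192 * e1 + 768, e0, e0, 1)
    # Output format: Float((* 768 E0), E0, E0, 1)
    out_strides = (768 * e0, e0, e0, 1)
    return e1, e0, in_strides, out_strides

# Assumed example: a 7-frame sequence, matching the z output above.
print(decode(7))
```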
Environment
TensorRT Version: 10.9.0.34
GPU Type: NVIDIA GeForce RTX 4090
Nvidia Driver Version: 535.154.05
CUDA Version: 12.2
CUDNN Version: cuda_12.4.r12.4
Operating System + Version: Debian 5.4.250-4-velinux1u1
Python Version (if applicable): 3.11.9
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 2.4.0+cu121
Baremetal or Container (if container which image + tag):
Relevant Files
I have uploaded the model to Google Drive for analysis.
Model link: G_model_mix_nolength.onnx - Google Drive
Steps To Reproduce
Please include:
- Exact steps/commands to build your repro
- Exact steps/commands to run your repro
- Full traceback of errors encountered
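In case it helps, a sketch of the kind of build command that triggers the error. The exact command and shape ranges I used were not recorded here, so the min/opt/max values below are hypothetical; the input names match the model's inputs shown in the polygraphy output above (x, sid, tone, language, bert, ja_bert, en_bert, with a dynamic text_length axis).

```shell
# Hypothetical reproduction sketch -- shape ranges are assumptions,
# adjust them to match your export before running.
trtexec \
  --onnx=G_model_mix_nolength.onnx \
  --minShapes=x:1x1,tone:1x1,language:1x1,bert:1x1024x1,ja_bert:1x1024x1,en_bert:1x1024x1,sid:1 \
  --optShapes=x:1x64,tone:1x64,language:1x64,bert:1x1024x64,ja_bert:1x1024x64,en_bert:1x1024x64,sid:1 \
  --maxShapes=x:1x256,tone:1x256,language:1x256,bert:1x1024x256,ja_bert:1x1024x256,en_bert:1x1024x256,sid:1 \
  --verbose
```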