Hardware - GPU: RTX 4070
Hardware - CPU: i9-12900
Operating System: Ubuntu 20.04 (Linux)
Riva Version: 2.12.1

Hi @rvinobha, I have switched to this platform but still run into the same problem: the riva-onnx-fastpitch_encoder-zh_test model fails to load with "Unsupported model IR version: 9, max supported IR version: 8", so Triton never reaches the ready state.
Below is the full output from docker logs riva-speech.
==========================
=== Riva Speech Skills ===
==========================
NVIDIA Release 23.06.1 (build 64517154)
Copyright (c) 2018-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
Riva waiting for Triton server to load all models…retrying in 1 second
I0829 04:06:26.138126 101 pinned_memory_manager.cc:240] Pinned memory pool is created at ‘0x7efdaa000000’ with size 268435456
I0829 04:06:26.138393 101 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 1000000000
I0829 04:06:26.141770 101 model_lifecycle.cc:459] loading: riva-onnx-fastpitch_encoder-zh_test:1
I0829 04:06:26.141809 101 model_lifecycle.cc:459] loading: riva-trt-hifigan-zh_test:1
I0829 04:06:26.141848 101 model_lifecycle.cc:459] loading: spectrogram_chunker-zh_test:1
I0829 04:06:26.141878 101 model_lifecycle.cc:459] loading: tts_postprocessor-zh_test:1
I0829 04:06:26.141912 101 model_lifecycle.cc:459] loading: tts_preprocessor-zh_test:1
I0829 04:06:26.142847 101 onnxruntime.cc:2459] TRITONBACKEND_Initialize: onnxruntime
I0829 04:06:26.142866 101 onnxruntime.cc:2469] Triton TRITONBACKEND API version: 1.10
I0829 04:06:26.142875 101 onnxruntime.cc:2475] ‘onnxruntime’ TRITONBACKEND API version: 1.10
I0829 04:06:26.142880 101 onnxruntime.cc:2505] backend configuration:
{“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}}
I0829 04:06:26.179495 101 tensorrt.cc:5444] TRITONBACKEND_Initialize: tensorrt
I0829 04:06:26.179509 101 tensorrt.cc:5454] Triton TRITONBACKEND API version: 1.10
I0829 04:06:26.179514 101 tensorrt.cc:5460] ‘tensorrt’ TRITONBACKEND API version: 1.10
I0829 04:06:26.179517 101 tensorrt.cc:5488] backend configuration:
{“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}}
I0829 04:06:26.423976 101 tensorrt.cc:5578] TRITONBACKEND_ModelInitialize: riva-trt-hifigan-zh_test (version 1)
I0829 04:06:26.424462 101 backend_model.cc:188] Overriding execution policy to “TRITONBACKEND_EXECUTION_BLOCKING” for sequence model “riva-trt-hifigan-zh_test”
I0829 04:06:26.424494 101 onnxruntime.cc:2563] TRITONBACKEND_ModelInitialize: riva-onnx-fastpitch_encoder-zh_test (version 1)
I0829 04:06:26.424915 101 spectrogram-chunker.cc:270] TRITONBACKEND_ModelInitialize: spectrogram_chunker-zh_test (version 1)
I0829 04:06:26.425378 101 backend_model.cc:303] model configuration:
{
“name”: “spectrogram_chunker-zh_test”,
“platform”: “”,
“backend”: “riva_tts_chunker”,
“version_policy”: {
“latest”: {
“num_versions”: 1
}
},
“max_batch_size”: 8,
“input”: [
{
“name”: “SPECTROGRAM”,
“data_type”: “TYPE_FP32”,
“format”: “FORMAT_NONE”,
“dims”: [
80,
-1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “IS_LAST_SENTENCE”,
“data_type”: “TYPE_INT32”,
“format”: “FORMAT_NONE”,
“dims”: [
1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “NUM_VALID_FRAMES_IN”,
“data_type”: “TYPE_INT64”,
“format”: “FORMAT_NONE”,
“dims”: [
1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “SENTENCE_NUM”,
“data_type”: “TYPE_INT32”,
“format”: “FORMAT_NONE”,
“dims”: [
1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “DURATIONS”,
“data_type”: “TYPE_FP32”,
“format”: “FORMAT_NONE”,
“dims”: [
-1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “PROCESSED_TEXT”,
“data_type”: “TYPE_STRING”,
“format”: “FORMAT_NONE”,
“dims”: [
1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “VOLUME”,
“data_type”: “TYPE_FP32”,
“format”: “FORMAT_NONE”,
“dims”: [
-1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
}
],
“output”: [
{
“name”: “SPECTROGRAM_CHUNK”,
“data_type”: “TYPE_FP32”,
“dims”: [
80,
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “END_FLAG”,
“data_type”: “TYPE_INT32”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “NUM_VALID_SAMPLES_OUT”,
“data_type”: “TYPE_INT32”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “SENTENCE_NUM”,
“data_type”: “TYPE_INT32”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “DURATIONS”,
“data_type”: “TYPE_FP32”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “PROCESSED_TEXT”,
“data_type”: “TYPE_STRING”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “VOLUME”,
“data_type”: “TYPE_FP32”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
}
],
“batch_input”: ,
“batch_output”: ,
“optimization”: {
“priority”: “PRIORITY_DEFAULT”,
“input_pinned_memory”: {
“enable”: true
},
“output_pinned_memory”: {
“enable”: true
},
“gather_kernel_buffer_threshold”: 0,
“eager_batching”: false
},
“sequence_batching”: {
“oldest”: {
“max_candidate_sequences”: 8,
“preferred_batch_size”: [
8
],
“max_queue_delay_microseconds”: 1000
},
“max_sequence_idle_microseconds”: 60000000,
“control_input”: [
{
“name”: “START”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_START”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “READY”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_READY”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “END”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_END”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “CORRID”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_CORRID”,
“int32_false_true”: ,
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_UINT64”
}
]
}
],
“state”:
},
“instance_group”: [
{
“name”: “spectrogram_chunker-zh_test_0”,
“kind”: “KIND_GPU”,
“count”: 1,
“gpus”: [
0
],
“secondary_devices”: ,
“profile”: ,
“passive”: false,
“host_policy”: “”
}
],
“default_model_filename”: “”,
“cc_model_filenames”: {},
“metric_tags”: {},
“parameters”: {
“num_samples_per_frame”: {
“string_value”: “256”
},
“supports_volume”: {
“string_value”: “True”
},
“max_execution_batch_size”: {
“string_value”: “8”
},
“chunk_length”: {
“string_value”: “80”
},
“num_mels”: {
“string_value”: “80”
}
},
“model_warmup”: ,
“model_transaction_policy”: {
“decoupled”: true
}
}
I0829 04:06:26.425423 101 spectrogram-chunker.cc:272] TRITONBACKEND_ModelInstanceInitialize: spectrogram_chunker-zh_test_0 (device 0)
I0829 04:06:26.432635 101 tts-postprocessor.cc:305] TRITONBACKEND_ModelInitialize: tts_postprocessor-zh_test (version 1)
I0829 04:06:26.433048 101 backend_model.cc:303] model configuration:
{
“name”: “tts_postprocessor-zh_test”,
“platform”: “”,
“backend”: “riva_tts_postprocessor”,
“version_policy”: {
“latest”: {
“num_versions”: 1
}
},
“max_batch_size”: 8,
“input”: [
{
“name”: “INPUT”,
“data_type”: “TYPE_FP32”,
“format”: “FORMAT_NONE”,
“dims”: [
1,
-1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “NUM_VALID_SAMPLES”,
“data_type”: “TYPE_INT32”,
“format”: “FORMAT_NONE”,
“dims”: [
1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “Prosody_volume”,
“data_type”: “TYPE_FP32”,
“format”: “FORMAT_NONE”,
“dims”: [
-1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
}
],
“output”: [
{
“name”: “OUTPUT”,
“data_type”: “TYPE_FP32”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
}
],
“batch_input”: ,
“batch_output”: ,
“optimization”: {
“priority”: “PRIORITY_DEFAULT”,
“input_pinned_memory”: {
“enable”: true
},
“output_pinned_memory”: {
“enable”: true
},
“gather_kernel_buffer_threshold”: 0,
“eager_batching”: false
},
“sequence_batching”: {
“oldest”: {
“max_candidate_sequences”: 8,
“preferred_batch_size”: [
8
],
“max_queue_delay_microseconds”: 100
},
“max_sequence_idle_microseconds”: 60000000,
“control_input”: [
{
“name”: “START”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_START”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “READY”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_READY”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “END”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_END”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “CORRID”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_CORRID”,
“int32_false_true”: ,
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_UINT64”
}
]
}
],
“state”:
},
“instance_group”: [
{
“name”: “tts_postprocessor-zh_test_0”,
“kind”: “KIND_GPU”,
“count”: 1,
“gpus”: [
0
],
“secondary_devices”: ,
“profile”: ,
“passive”: false,
“host_policy”: “”
}
],
“default_model_filename”: “”,
“cc_model_filenames”: {},
“metric_tags”: {},
“parameters”: {
“supports_volume”: {
“string_value”: “True”
},
“max_chunk_size”: {
“string_value”: “65536”
},
“chunk_num_samples”: {
“string_value”: “20480”
},
“fade_length”: {
“string_value”: “128”
},
“num_samples_per_frame”: {
“string_value”: “256”
},
“hop_length”: {
“string_value”: “256”
},
“filter_length”: {
“string_value”: “1024”
},
“use_denoiser”: {
“string_value”: “False”
},
“max_execution_batch_size”: {
“string_value”: “8”
}
},
“model_warmup”: ,
“model_transaction_policy”: {
“decoupled”: false
}
}
I0829 04:06:26.433069 101 tensorrt.cc:5627] TRITONBACKEND_ModelInstanceInitialize: riva-trt-hifigan-zh_test_0 (GPU device 0)
Riva waiting for Triton server to load all models…retrying in 1 second
I0829 04:06:27.021194 101 logging.cc:49] Loaded engine size: 28 MiB
I0829 04:06:27.161244 101 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +28, now: CPU 0, GPU 28 (MiB)
I0829 04:06:27.178625 101 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +92, now: CPU 0, GPU 120 (MiB)
W0829 04:06:27.178648 101 logging.cc:46] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See CUDA_MODULE_LOADING in CUDA C++ Programming Guide
I0829 04:06:27.178911 101 tensorrt.cc:1547] Created instance riva-trt-hifigan-zh_test_0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0829 04:06:27.178986 101 onnxruntime.cc:2606] TRITONBACKEND_ModelInstanceInitialize: riva-onnx-fastpitch_encoder-zh_test_0 (GPU device 0)
I0829 04:06:27.508110 101 model_lifecycle.cc:693] successfully loaded ‘riva-trt-hifigan-zh_test’ version 1
I0829 04:06:27.508186 101 model_lifecycle.cc:693] successfully loaded ‘spectrogram_chunker-zh_test’ version 1
Riva waiting for Triton server to load all models…retrying in 1 second
I0829 04:06:27.623317 101 onnxruntime.cc:2640] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0829 04:06:27.623331 101 tts-preprocessor.cc:277] TRITONBACKEND_ModelInitialize: tts_preprocessor-zh_test (version 1)
I0829 04:06:27.623357 101 onnxruntime.cc:2586] TRITONBACKEND_ModelFinalize: delete model state
E0829 04:06:27.623375 101 model_lifecycle.cc:596] failed to load ‘riva-onnx-fastpitch_encoder-zh_test’ version 1: Internal: onnx runtime error 1: Load model from /data/models/riva-onnx-fastpitch_encoder-zh_test/1/model.onnx failed:/workspace/onnxruntime/onnxruntime/core/graph/model.cc:146 onnxruntime::Model::Model(onnx::ModelProto&&, const PathString&, const IOnnxRuntimeOpSchemaRegistryList* , const onnxruntime::logging::Logger&, const onnxruntime::ModelOptions&) Unsupported model IR version: 9, max supported IR version: 8
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0829 04:06:27.623776 111 preprocessor.cc:228] TTS character mapping loaded from /data/models/tts_preprocessor-zh_test/1/mapping.txt
I0829 04:06:27.624769 111 preprocessor.cc:265] TTS phonetic mapping loaded from /data/models/tts_preprocessor-zh_test/1/4b5a9935d8ca4948aca1fe62e17e1dad_pinyin_dict_nv_22.10.txt
I0829 04:06:27.624787 111 normalize.cc:52] Speech Class far file missing:/data/models/tts_preprocessor-zh_test/1/speech_class.far
Riva waiting for Triton server to load all models…retrying in 1 second
I0829 04:06:28.793210 111 preprocessor.cc:288] TTS normalizer loaded from /data/models/tts_preprocessor-zh_test/1/
I0829 04:06:28.793290 101 backend_model.cc:303] model configuration:
{
“name”: “tts_preprocessor-zh_test”,
“platform”: “”,
“backend”: “riva_tts_preprocessor”,
“version_policy”: {
“latest”: {
“num_versions”: 1
}
},
“max_batch_size”: 8,
“input”: [
{
“name”: “input_string”,
“data_type”: “TYPE_STRING”,
“format”: “FORMAT_NONE”,
“dims”: [
1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
}
],
“output”: [
{
“name”: “output”,
“data_type”: “TYPE_INT64”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “output_mask”,
“data_type”: “TYPE_FP32”,
“dims”: [
1,
400,
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “output_length”,
“data_type”: “TYPE_INT32”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “is_last_sentence”,
“data_type”: “TYPE_INT32”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “output_string”,
“data_type”: “TYPE_STRING”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “sentence_num”,
“data_type”: “TYPE_INT32”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “pitch”,
“data_type”: “TYPE_FP32”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “duration”,
“data_type”: “TYPE_FP32”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “volume”,
“data_type”: “TYPE_FP32”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
}
],
“batch_input”: ,
“batch_output”: ,
“optimization”: {
“graph”: {
“level”: 0
},
“priority”: “PRIORITY_DEFAULT”,
“cuda”: {
“graphs”: false,
“busy_wait_events”: false,
“graph_spec”: ,
“output_copy_stream”: true
},
“input_pinned_memory”: {
“enable”: true
},
“output_pinned_memory”: {
“enable”: true
},
“gather_kernel_buffer_threshold”: 0,
“eager_batching”: false
},
“sequence_batching”: {
“oldest”: {
“max_candidate_sequences”: 8,
“preferred_batch_size”: [
8
],
“max_queue_delay_microseconds”: 100
},
“max_sequence_idle_microseconds”: 60000000,
“control_input”: [
{
“name”: “START”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_START”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “READY”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_READY”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “END”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_END”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “CORRID”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_CORRID”,
“int32_false_true”: ,
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_UINT64”
}
]
}
],
“state”:
},
“instance_group”: [
{
“name”: “tts_preprocessor-zh_test_0”,
“kind”: “KIND_GPU”,
“count”: 1,
“gpus”: [
0
],
“secondary_devices”: ,
“profile”: ,
“passive”: false,
“host_policy”: “”
}
],
“default_model_filename”: “”,
“cc_model_filenames”: {},
“metric_tags”: {},
“parameters”: {
“phone_set”: {
“string_value”: “arpabet”
},
“pitch_std”: {
“string_value”: “58.773109436035156”
},
“mapping_path”: {
“string_value”: “/data/models/tts_preprocessor-zh_test/1/mapping.txt”
},
“pad_with_space”: {
“string_value”: “True”
},
“dictionary_path”: {
“string_value”: “/data/models/tts_preprocessor-zh_test/1/4b5a9935d8ca4948aca1fe62e17e1dad_pinyin_dict_nv_22.10.txt”
},
“normalize_pitch”: {
“string_value”: “True”
},
“upper_case_g2p”: {
“string_value”: “True”
},
“upper_case_chars”: {
“string_value”: “False”
},
“max_input_length”: {
“string_value”: “2000”
},
“g2p_ignore_ambiguous”: {
“string_value”: “True”
},
“supports_ragged_batches”: {
“string_value”: “True”
},
“max_sequence_length”: {
“string_value”: “400”
},
“language”: {
“string_value”: “en-US”
},
“norm_proto_path”: {
“string_value”: “/data/models/tts_preprocessor-zh_test/1/”
}
},
“model_warmup”: ,
“model_transaction_policy”: {
“decoupled”: true
}
}
I0829 04:06:28.793341 101 tts-postprocessor.cc:307] TRITONBACKEND_ModelInstanceInitialize: tts_postprocessor-zh_test_0 (device 0)
I0829 04:06:28.801867 101 tts-preprocessor.cc:279] TRITONBACKEND_ModelInstanceInitialize: tts_preprocessor-zh_test_0 (device 0)
I0829 04:06:28.802161 101 model_lifecycle.cc:693] successfully loaded ‘tts_preprocessor-zh_test’ version 1
I0829 04:06:28.802248 101 model_lifecycle.cc:693] successfully loaded ‘tts_postprocessor-zh_test’ version 1
E0829 04:06:28.802317 101 model_repository_manager.cc:481] Invalid argument: ensemble ‘fastpitch_hifigan_ensemble-zh_test’ depends on ‘riva-onnx-fastpitch_encoder-zh_test’ which has no loaded version
I0829 04:06:28.802351 101 server.cc:563]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+
I0829 04:06:28.802389 101 server.cc:590]
+------------------------+----------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend | Path | Config |
+------------------------+----------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}} |
| riva_tts_preprocessor | /opt/tritonserver/backends/riva_tts_preprocessor/libtriton_riva_tts_preprocessor.so | {“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}} |
| tensorrt | /opt/tritonserver/backends/tensorrt/libtriton_tensorrt.so | {“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}} |
| riva_tts_chunker | /opt/tritonserver/backends/riva_tts_chunker/libtriton_riva_tts_chunker.so | {“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}} |
| riva_tts_postprocessor | /opt/tritonserver/backends/riva_tts_postprocessor/libtriton_riva_tts_postprocessor.so | {“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}} |
+------------------------+----------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
I0829 04:06:28.802444 101 server.cc:633]
+-------------------------------------+---------+------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Model | Version | Status |
+-------------------------------------+---------+------------------------------------------------------------------------------------------------------------------------------------------------------------+
| riva-onnx-fastpitch_encoder-zh_test | 1 | UNAVAILABLE: Internal: onnx runtime error 1: Load model from /data/models/riva-onnx-fastpitch_encoder-zh_test/1/model.onnx failed:/workspace/onnxruntime/onnxruntime/core/graph/model.cc:146 onnxruntime::Model::Model(onnx::ModelProto&&, const PathString&, const IOnnxRuntimeOpSchemaRegistryList* , const onnxruntime::logging::Logger&, const onnxruntime::ModelOptions&) Unsupported model IR version: 9, max supported IR version: 8 |
| riva-trt-hifigan-zh_test | 1 | READY |
| spectrogram_chunker-zh_test | 1 | READY |
| tts_postprocessor-zh_test | 1 | READY |
| tts_preprocessor-zh_test | 1 | READY |
+-------------------------------------+---------+------------------------------------------------------------------------------------------------------------------------------------------------------------+
I0829 04:06:28.813368 101 metrics.cc:864] Collecting metrics for GPU 0: NVIDIA GeForce RTX 4070
I0829 04:06:28.813475 101 metrics.cc:757] Collecting CPU metrics
I0829 04:06:28.813566 101 tritonserver.cc:2264]
+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.27.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace logging |
| model_repository_path[0] | /data/models |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 1000000000 |
| response_cache_byte_size | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
I0829 04:06:28.813570 101 server.cc:264] Waiting for in-flight requests to complete.
I0829 04:06:28.813574 101 server.cc:280] Timeout 30: Found 0 model versions that have in-flight inferences
I0829 04:06:28.813631 101 server.cc:295] All models are stopped, unloading models
I0829 04:06:28.813634 101 server.cc:302] Timeout 30: Found 4 live models and 0 in-flight non-inference requests
I0829 04:06:28.813676 101 tts-postprocessor.cc:310] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0829 04:06:28.813706 101 tts-preprocessor.cc:282] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0829 04:06:28.813724 101 tensorrt.cc:5665] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0829 04:06:28.813739 101 tts-preprocessor.cc:278] TRITONBACKEND_ModelFinalize: delete model state
I0829 04:06:28.813771 101 spectrogram-chunker.cc:275] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0829 04:06:28.813794 101 spectrogram-chunker.cc:271] TRITONBACKEND_ModelFinalize: delete model state
I0829 04:06:28.814087 101 model_lifecycle.cc:578] successfully unloaded ‘spectrogram_chunker-zh_test’ version 1
I0829 04:06:28.815058 101 tts-postprocessor.cc:306] TRITONBACKEND_ModelFinalize: delete model state
I0829 04:06:28.815126 101 model_lifecycle.cc:578] successfully unloaded ‘tts_postprocessor-zh_test’ version 1
I0829 04:06:28.828984 101 tensorrt.cc:5604] TRITONBACKEND_ModelFinalize: delete model state
I0829 04:06:28.829262 101 model_lifecycle.cc:578] successfully unloaded ‘riva-trt-hifigan-zh_test’ version 1
I0829 04:06:28.907600 101 model_lifecycle.cc:578] successfully unloaded ‘tts_preprocessor-zh_test’ version 1
Riva waiting for Triton server to load all models…retrying in 1 second
I0829 04:06:29.813732 101 server.cc:302] Timeout 29: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
Riva waiting for Triton server to load all models…retrying in 1 second
Riva waiting for Triton server to load all models…retrying in 1 second
Triton server died before reaching ready state. Terminating Riva startup.
Check Triton logs with: docker logs
/opt/riva/bin/start-riva: line 1: kill: (101) - No such process
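
For what it's worth, below is a minimal sketch of how the IR version of the rejected encoder could be checked outside the container. This is only my own diagnostic idea, assuming the onnx Python package is installed on the host; the model path is the one reported in the error above.

# Minimal sketch (assumption: the onnx Python package is available).
# Loads the FastPitch encoder that ONNX Runtime rejects and prints the
# IR version and opsets it was exported with.
import onnx

# Path taken from the error message in the log above.
model_path = "/data/models/riva-onnx-fastpitch_encoder-zh_test/1/model.onnx"

model = onnx.load(model_path)
print("IR version:", model.ir_version)  # the error suggests this prints 9
print("Opset imports:", [(op.domain or "ai.onnx", op.version) for op in model.opset_import])
print("Producer:", model.producer_name, model.producer_version)

As far as I understand, IR version 9 is written by onnx 1.14 and newer, while the ONNX Runtime shipped in this container only accepts IR version 8 or lower, so my guess is that the encoder needs to be re-exported with an older onnx/exporter version before running riva-build and riva-deploy again. Please correct me if I am on the wrong track.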