Hardware - GPU: RTX 4070
Hardware - CPU: i9-12900
Operating System: Ubuntu 20.04 (Linux)
Riva Version: 2.12.1

Hi @rvinobha, I have switched to this platform but still run into the same problem: the riva-onnx-fastpitch_encoder-zh_test model fails to load with "Unsupported model IR version: 9, max supported IR version: 8", so Triton never reaches the ready state.
Below is the full output from docker logs riva-speech.
==========================
=== Riva Speech Skills ===
==========================
NVIDIA Release 23.06.1 (build 64517154)
Copyright (c) 2018-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
Riva waiting for Triton server to load all models…retrying in 1 second
I0829 04:06:26.138126 101 pinned_memory_manager.cc:240] Pinned memory pool is created at ‘0x7efdaa000000’ with size 268435456
I0829 04:06:26.138393 101 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 1000000000
I0829 04:06:26.141770 101 model_lifecycle.cc:459] loading: riva-onnx-fastpitch_encoder-zh_test:1
I0829 04:06:26.141809 101 model_lifecycle.cc:459] loading: riva-trt-hifigan-zh_test:1
I0829 04:06:26.141848 101 model_lifecycle.cc:459] loading: spectrogram_chunker-zh_test:1
I0829 04:06:26.141878 101 model_lifecycle.cc:459] loading: tts_postprocessor-zh_test:1
I0829 04:06:26.141912 101 model_lifecycle.cc:459] loading: tts_preprocessor-zh_test:1
I0829 04:06:26.142847 101 onnxruntime.cc:2459] TRITONBACKEND_Initialize: onnxruntime
I0829 04:06:26.142866 101 onnxruntime.cc:2469] Triton TRITONBACKEND API version: 1.10
I0829 04:06:26.142875 101 onnxruntime.cc:2475] ‘onnxruntime’ TRITONBACKEND API version: 1.10
I0829 04:06:26.142880 101 onnxruntime.cc:2505] backend configuration:
{“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}}
I0829 04:06:26.179495 101 tensorrt.cc:5444] TRITONBACKEND_Initialize: tensorrt
I0829 04:06:26.179509 101 tensorrt.cc:5454] Triton TRITONBACKEND API version: 1.10
I0829 04:06:26.179514 101 tensorrt.cc:5460] ‘tensorrt’ TRITONBACKEND API version: 1.10
I0829 04:06:26.179517 101 tensorrt.cc:5488] backend configuration:
{“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}}
I0829 04:06:26.423976 101 tensorrt.cc:5578] TRITONBACKEND_ModelInitialize: riva-trt-hifigan-zh_test (version 1)
I0829 04:06:26.424462 101 backend_model.cc:188] Overriding execution policy to “TRITONBACKEND_EXECUTION_BLOCKING” for sequence model “riva-trt-hifigan-zh_test”
I0829 04:06:26.424494 101 onnxruntime.cc:2563] TRITONBACKEND_ModelInitialize: riva-onnx-fastpitch_encoder-zh_test (version 1)
I0829 04:06:26.424915 101 spectrogram-chunker.cc:270] TRITONBACKEND_ModelInitialize: spectrogram_chunker-zh_test (version 1)
I0829 04:06:26.425378 101 backend_model.cc:303] model configuration:
{
“name”: “spectrogram_chunker-zh_test”,
“platform”: “”,
“backend”: “riva_tts_chunker”,
“version_policy”: {
“latest”: {
“num_versions”: 1
}
},
“max_batch_size”: 8,
“input”: [
{
“name”: “SPECTROGRAM”,
“data_type”: “TYPE_FP32”,
“format”: “FORMAT_NONE”,
“dims”: [
80,
-1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “IS_LAST_SENTENCE”,
“data_type”: “TYPE_INT32”,
“format”: “FORMAT_NONE”,
“dims”: [
1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “NUM_VALID_FRAMES_IN”,
“data_type”: “TYPE_INT64”,
“format”: “FORMAT_NONE”,
“dims”: [
1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “SENTENCE_NUM”,
“data_type”: “TYPE_INT32”,
“format”: “FORMAT_NONE”,
“dims”: [
1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “DURATIONS”,
“data_type”: “TYPE_FP32”,
“format”: “FORMAT_NONE”,
“dims”: [
-1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “PROCESSED_TEXT”,
“data_type”: “TYPE_STRING”,
“format”: “FORMAT_NONE”,
“dims”: [
1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “VOLUME”,
“data_type”: “TYPE_FP32”,
“format”: “FORMAT_NONE”,
“dims”: [
-1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
}
],
“output”: [
{
“name”: “SPECTROGRAM_CHUNK”,
“data_type”: “TYPE_FP32”,
“dims”: [
80,
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “END_FLAG”,
“data_type”: “TYPE_INT32”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “NUM_VALID_SAMPLES_OUT”,
“data_type”: “TYPE_INT32”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “SENTENCE_NUM”,
“data_type”: “TYPE_INT32”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “DURATIONS”,
“data_type”: “TYPE_FP32”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “PROCESSED_TEXT”,
“data_type”: “TYPE_STRING”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “VOLUME”,
“data_type”: “TYPE_FP32”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
}
],
“batch_input”: ,
“batch_output”: ,
“optimization”: {
“priority”: “PRIORITY_DEFAULT”,
“input_pinned_memory”: {
“enable”: true
},
“output_pinned_memory”: {
“enable”: true
},
“gather_kernel_buffer_threshold”: 0,
“eager_batching”: false
},
“sequence_batching”: {
“oldest”: {
“max_candidate_sequences”: 8,
“preferred_batch_size”: [
8
],
“max_queue_delay_microseconds”: 1000
},
“max_sequence_idle_microseconds”: 60000000,
“control_input”: [
{
“name”: “START”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_START”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “READY”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_READY”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “END”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_END”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “CORRID”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_CORRID”,
“int32_false_true”: ,
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_UINT64”
}
]
}
],
“state”:
},
“instance_group”: [
{
“name”: “spectrogram_chunker-zh_test_0”,
“kind”: “KIND_GPU”,
“count”: 1,
“gpus”: [
0
],
“secondary_devices”: ,
“profile”: ,
“passive”: false,
“host_policy”: “”
}
],
“default_model_filename”: “”,
“cc_model_filenames”: {},
“metric_tags”: {},
“parameters”: {
“num_samples_per_frame”: {
“string_value”: “256”
},
“supports_volume”: {
“string_value”: “True”
},
“max_execution_batch_size”: {
“string_value”: “8”
},
“chunk_length”: {
“string_value”: “80”
},
“num_mels”: {
“string_value”: “80”
}
},
“model_warmup”: ,
“model_transaction_policy”: {
“decoupled”: true
}
}
I0829 04:06:26.425423 101 spectrogram-chunker.cc:272] TRITONBACKEND_ModelInstanceInitialize: spectrogram_chunker-zh_test_0 (device 0)
I0829 04:06:26.432635 101 tts-postprocessor.cc:305] TRITONBACKEND_ModelInitialize: tts_postprocessor-zh_test (version 1)
I0829 04:06:26.433048 101 backend_model.cc:303] model configuration:
{
“name”: “tts_postprocessor-zh_test”,
“platform”: “”,
“backend”: “riva_tts_postprocessor”,
“version_policy”: {
“latest”: {
“num_versions”: 1
}
},
“max_batch_size”: 8,
“input”: [
{
“name”: “INPUT”,
“data_type”: “TYPE_FP32”,
“format”: “FORMAT_NONE”,
“dims”: [
1,
-1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “NUM_VALID_SAMPLES”,
“data_type”: “TYPE_INT32”,
“format”: “FORMAT_NONE”,
“dims”: [
1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “Prosody_volume”,
“data_type”: “TYPE_FP32”,
“format”: “FORMAT_NONE”,
“dims”: [
-1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
}
],
“output”: [
{
“name”: “OUTPUT”,
“data_type”: “TYPE_FP32”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
}
],
“batch_input”: ,
“batch_output”: ,
“optimization”: {
“priority”: “PRIORITY_DEFAULT”,
“input_pinned_memory”: {
“enable”: true
},
“output_pinned_memory”: {
“enable”: true
},
“gather_kernel_buffer_threshold”: 0,
“eager_batching”: false
},
“sequence_batching”: {
“oldest”: {
“max_candidate_sequences”: 8,
“preferred_batch_size”: [
8
],
“max_queue_delay_microseconds”: 100
},
“max_sequence_idle_microseconds”: 60000000,
“control_input”: [
{
“name”: “START”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_START”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “READY”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_READY”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “END”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_END”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “CORRID”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_CORRID”,
“int32_false_true”: ,
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_UINT64”
}
]
}
],
“state”:
},
“instance_group”: [
{
“name”: “tts_postprocessor-zh_test_0”,
“kind”: “KIND_GPU”,
“count”: 1,
“gpus”: [
0
],
“secondary_devices”: ,
“profile”: ,
“passive”: false,
“host_policy”: “”
}
],
“default_model_filename”: “”,
“cc_model_filenames”: {},
“metric_tags”: {},
“parameters”: {
“supports_volume”: {
“string_value”: “True”
},
“max_chunk_size”: {
“string_value”: “65536”
},
“chunk_num_samples”: {
“string_value”: “20480”
},
“fade_length”: {
“string_value”: “128”
},
“num_samples_per_frame”: {
“string_value”: “256”
},
“hop_length”: {
“string_value”: “256”
},
“filter_length”: {
“string_value”: “1024”
},
“use_denoiser”: {
“string_value”: “False”
},
“max_execution_batch_size”: {
“string_value”: “8”
}
},
“model_warmup”: ,
“model_transaction_policy”: {
“decoupled”: false
}
}
I0829 04:06:26.433069 101 tensorrt.cc:5627] TRITONBACKEND_ModelInstanceInitialize: riva-trt-hifigan-zh_test_0 (GPU device 0)
Riva waiting for Triton server to load all models…retrying in 1 second
I0829 04:06:27.021194 101 logging.cc:49] Loaded engine size: 28 MiB
I0829 04:06:27.161244 101 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +28, now: CPU 0, GPU 28 (MiB)
I0829 04:06:27.178625 101 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +92, now: CPU 0, GPU 120 (MiB)
W0829 04:06:27.178648 101 logging.cc:46] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See CUDA_MODULE_LOADING in CUDA C++ Programming Guide
I0829 04:06:27.178911 101 tensorrt.cc:1547] Created instance riva-trt-hifigan-zh_test_0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0829 04:06:27.178986 101 onnxruntime.cc:2606] TRITONBACKEND_ModelInstanceInitialize: riva-onnx-fastpitch_encoder-zh_test_0 (GPU device 0)
I0829 04:06:27.508110 101 model_lifecycle.cc:693] successfully loaded ‘riva-trt-hifigan-zh_test’ version 1
I0829 04:06:27.508186 101 model_lifecycle.cc:693] successfully loaded ‘spectrogram_chunker-zh_test’ version 1
Riva waiting for Triton server to load all models…retrying in 1 second
I0829 04:06:27.623317 101 onnxruntime.cc:2640] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0829 04:06:27.623331 101 tts-preprocessor.cc:277] TRITONBACKEND_ModelInitialize: tts_preprocessor-zh_test (version 1)
I0829 04:06:27.623357 101 onnxruntime.cc:2586] TRITONBACKEND_ModelFinalize: delete model state
E0829 04:06:27.623375 101 model_lifecycle.cc:596] failed to load ‘riva-onnx-fastpitch_encoder-zh_test’ version 1: Internal: onnx runtime error 1: Load model from /data/models/riva-onnx-fastpitch_encoder-zh_test/1/model.onnx failed:/workspace/onnxruntime/onnxruntime/core/graph/model.cc:146 onnxruntime::Model::Model(onnx::ModelProto&&, const PathString&, const IOnnxRuntimeOpSchemaRegistryList* , const onnxruntime::logging::Logger&, const onnxruntime::ModelOptions&) Unsupported model IR version: 9, max supported IR version: 8
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0829 04:06:27.623776 111 preprocessor.cc:228] TTS character mapping loaded from /data/models/tts_preprocessor-zh_test/1/mapping.txt
I0829 04:06:27.624769 111 preprocessor.cc:265] TTS phonetic mapping loaded from /data/models/tts_preprocessor-zh_test/1/4b5a9935d8ca4948aca1fe62e17e1dad_pinyin_dict_nv_22.10.txt
I0829 04:06:27.624787 111 normalize.cc:52] Speech Class far file missing:/data/models/tts_preprocessor-zh_test/1/speech_class.far
Riva waiting for Triton server to load all models…retrying in 1 second
I0829 04:06:28.793210 111 preprocessor.cc:288] TTS normalizer loaded from /data/models/tts_preprocessor-zh_test/1/
I0829 04:06:28.793290 101 backend_model.cc:303] model configuration:
{
“name”: “tts_preprocessor-zh_test”,
“platform”: “”,
“backend”: “riva_tts_preprocessor”,
“version_policy”: {
“latest”: {
“num_versions”: 1
}
},
“max_batch_size”: 8,
“input”: [
{
“name”: “input_string”,
“data_type”: “TYPE_STRING”,
“format”: “FORMAT_NONE”,
“dims”: [
1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
}
],
“output”: [
{
“name”: “output”,
“data_type”: “TYPE_INT64”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “output_mask”,
“data_type”: “TYPE_FP32”,
“dims”: [
1,
400,
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “output_length”,
“data_type”: “TYPE_INT32”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “is_last_sentence”,
“data_type”: “TYPE_INT32”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “output_string”,
“data_type”: “TYPE_STRING”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “sentence_num”,
“data_type”: “TYPE_INT32”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “pitch”,
“data_type”: “TYPE_FP32”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “duration”,
“data_type”: “TYPE_FP32”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “volume”,
“data_type”: “TYPE_FP32”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
}
],
“batch_input”: ,
“batch_output”: ,
“optimization”: {
“graph”: {
“level”: 0
},
“priority”: “PRIORITY_DEFAULT”,
“cuda”: {
“graphs”: false,
“busy_wait_events”: false,
“graph_spec”: ,
“output_copy_stream”: true
},
“input_pinned_memory”: {
“enable”: true
},
“output_pinned_memory”: {
“enable”: true
},
“gather_kernel_buffer_threshold”: 0,
“eager_batching”: false
},
“sequence_batching”: {
“oldest”: {
“max_candidate_sequences”: 8,
“preferred_batch_size”: [
8
],
“max_queue_delay_microseconds”: 100
},
“max_sequence_idle_microseconds”: 60000000,
“control_input”: [
{
“name”: “START”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_START”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “READY”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_READY”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “END”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_END”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “CORRID”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_CORRID”,
“int32_false_true”: ,
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_UINT64”
}
]
}
],
“state”:
},
“instance_group”: [
{
“name”: “tts_preprocessor-zh_test_0”,
“kind”: “KIND_GPU”,
“count”: 1,
“gpus”: [
0
],
“secondary_devices”: ,
“profile”: ,
“passive”: false,
“host_policy”: “”
}
],
“default_model_filename”: “”,
“cc_model_filenames”: {},
“metric_tags”: {},
“parameters”: {
“phone_set”: {
“string_value”: “arpabet”
},
“pitch_std”: {
“string_value”: “58.773109436035156”
},
“mapping_path”: {
“string_value”: “/data/models/tts_preprocessor-zh_test/1/mapping.txt”
},
“pad_with_space”: {
“string_value”: “True”
},
“dictionary_path”: {
“string_value”: “/data/models/tts_preprocessor-zh_test/1/4b5a9935d8ca4948aca1fe62e17e1dad_pinyin_dict_nv_22.10.txt”
},
“normalize_pitch”: {
“string_value”: “True”
},
“upper_case_g2p”: {
“string_value”: “True”
},
“upper_case_chars”: {
“string_value”: “False”
},
“max_input_length”: {
“string_value”: “2000”
},
“g2p_ignore_ambiguous”: {
“string_value”: “True”
},
“supports_ragged_batches”: {
“string_value”: “True”
},
“max_sequence_length”: {
“string_value”: “400”
},
“language”: {
“string_value”: “en-US”
},
“norm_proto_path”: {
“string_value”: “/data/models/tts_preprocessor-zh_test/1/”
}
},
“model_warmup”: ,
“model_transaction_policy”: {
“decoupled”: true
}
}
I0829 04:06:28.793341 101 tts-postprocessor.cc:307] TRITONBACKEND_ModelInstanceInitialize: tts_postprocessor-zh_test_0 (device 0)
I0829 04:06:28.801867 101 tts-preprocessor.cc:279] TRITONBACKEND_ModelInstanceInitialize: tts_preprocessor-zh_test_0 (device 0)
I0829 04:06:28.802161 101 model_lifecycle.cc:693] successfully loaded ‘tts_preprocessor-zh_test’ version 1
I0829 04:06:28.802248 101 model_lifecycle.cc:693] successfully loaded ‘tts_postprocessor-zh_test’ version 1
E0829 04:06:28.802317 101 model_repository_manager.cc:481] Invalid argument: ensemble ‘fastpitch_hifigan_ensemble-zh_test’ depends on ‘riva-onnx-fastpitch_encoder-zh_test’ which has no loaded version
I0829 04:06:28.802351 101 server.cc:563]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+
I0829 04:06:28.802389 101 server.cc:590]
+------------------------+----------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend | Path | Config |
+------------------------+----------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}} |
| riva_tts_preprocessor | /opt/tritonserver/backends/riva_tts_preprocessor/libtriton_riva_tts_preprocessor.so | {“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}} |
| tensorrt | /opt/tritonserver/backends/tensorrt/libtriton_tensorrt.so | {“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}} |
| riva_tts_chunker | /opt/tritonserver/backends/riva_tts_chunker/libtriton_riva_tts_chunker.so | {“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}} |
| riva_tts_postprocessor | /opt/tritonserver/backends/riva_tts_postprocessor/libtriton_riva_tts_postprocessor.so | {“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}} |
+------------------------+----------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
I0829 04:06:28.802444 101 server.cc:633]
+-------------------------------------+---------+------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Model | Version | Status |
+-------------------------------------+---------+------------------------------------------------------------------------------------------------------------------------------------------------------------+
| riva-onnx-fastpitch_encoder-zh_test | 1 | UNAVAILABLE: Internal: onnx runtime error 1: Load model from /data/models/riva-onnx-fastpitch_encoder-zh_test/1/model.onnx failed:/workspace/onnxruntime/onnxruntime/core/graph/model.cc:146 onnxruntime::Model::Model(onnx::ModelProto&&, const PathString&, const IOnnxRuntimeOpSchemaRegistryList* , const onnxruntime::logging::Logger&, const onnxruntime::ModelOptions&) Unsupported model IR version: 9, max supported IR version: 8 |
| riva-trt-hifigan-zh_test | 1 | READY |
| spectrogram_chunker-zh_test | 1 | READY |
| tts_postprocessor-zh_test | 1 | READY |
| tts_preprocessor-zh_test | 1 | READY |
+-------------------------------------+---------+------------------------------------------------------------------------------------------------------------------------------------------------------------+
I0829 04:06:28.813368 101 metrics.cc:864] Collecting metrics for GPU 0: NVIDIA GeForce RTX 4070
I0829 04:06:28.813475 101 metrics.cc:757] Collecting CPU metrics
I0829 04:06:28.813566 101 tritonserver.cc:2264]
+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.27.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace logging |
| model_repository_path[0] | /data/models |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 1000000000 |
| response_cache_byte_size | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
I0829 04:06:28.813570 101 server.cc:264] Waiting for in-flight requests to complete.
I0829 04:06:28.813574 101 server.cc:280] Timeout 30: Found 0 model versions that have in-flight inferences
I0829 04:06:28.813631 101 server.cc:295] All models are stopped, unloading models
I0829 04:06:28.813634 101 server.cc:302] Timeout 30: Found 4 live models and 0 in-flight non-inference requests
I0829 04:06:28.813676 101 tts-postprocessor.cc:310] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0829 04:06:28.813706 101 tts-preprocessor.cc:282] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0829 04:06:28.813724 101 tensorrt.cc:5665] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0829 04:06:28.813739 101 tts-preprocessor.cc:278] TRITONBACKEND_ModelFinalize: delete model state
I0829 04:06:28.813771 101 spectrogram-chunker.cc:275] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0829 04:06:28.813794 101 spectrogram-chunker.cc:271] TRITONBACKEND_ModelFinalize: delete model state
I0829 04:06:28.814087 101 model_lifecycle.cc:578] successfully unloaded ‘spectrogram_chunker-zh_test’ version 1
I0829 04:06:28.815058 101 tts-postprocessor.cc:306] TRITONBACKEND_ModelFinalize: delete model state
I0829 04:06:28.815126 101 model_lifecycle.cc:578] successfully unloaded ‘tts_postprocessor-zh_test’ version 1
I0829 04:06:28.828984 101 tensorrt.cc:5604] TRITONBACKEND_ModelFinalize: delete model state
I0829 04:06:28.829262 101 model_lifecycle.cc:578] successfully unloaded ‘riva-trt-hifigan-zh_test’ version 1
I0829 04:06:28.907600 101 model_lifecycle.cc:578] successfully unloaded ‘tts_preprocessor-zh_test’ version 1
Riva waiting for Triton server to load all models…retrying in 1 second
I0829 04:06:29.813732 101 server.cc:302] Timeout 29: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
Riva waiting for Triton server to load all models…retrying in 1 second
Riva waiting for Triton server to load all models…retrying in 1 second
Triton server died before reaching ready state. Terminating Riva startup.
Check Triton logs with: docker logs
/opt/riva/bin/start-riva: line 1: kill: (101) - No such process
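
For what it's worth, below is a minimal sketch of how the IR version of the rejected encoder could be checked outside the container. This is only my own diagnostic idea, assuming the onnx Python package is installed on the host; the model path is the one reported in the error above.

# Minimal sketch (assumption: the onnx Python package is available).
# Loads the FastPitch encoder that ONNX Runtime rejects and prints the
# IR version and opsets it was exported with.
import onnx

# Path taken from the error message in the log above.
model_path = "/data/models/riva-onnx-fastpitch_encoder-zh_test/1/model.onnx"

model = onnx.load(model_path)
print("IR version:", model.ir_version)  # the error suggests this prints 9
print("Opset imports:", [(op.domain or "ai.onnx", op.version) for op in model.opset_import])
print("Producer:", model.producer_name, model.producer_version)

As far as I understand, IR version 9 is written by onnx 1.14 and newer, while the ONNX Runtime shipped in this container only accepts IR version 8 or lower, so my guess is that the encoder needs to be re-exported with an older onnx/exporter version before running riva-build and riva-deploy again. Please correct me if I am on the wrong track.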