NVIDIA Riva health check failure on RTX 4060 under WSL

Please provide the following information when requesting support:

Hardware - GPU: NVIDIA GeForce RTX 4060
Hardware - CPU: AMD
Operating System: WSL (Windows Subsystem for Linux)
Riva Version: 2.14.0
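
(Sanity check: GPU visibility from Docker inside WSL can be confirmed before starting Riva. A minimal sketch, assuming the NVIDIA Container Toolkit is set up in the WSL distribution; the CUDA image tag below is only an example:)

# GPU visible inside the WSL distribution
nvidia-smi

# GPU visible to Docker containers (example CUDA base image tag)
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi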
Error:
Starting Riva Speech Services. This may take several minutes depending on the number of models deployed.
Waiting for Riva server to load all models…retrying in 10 seconds (repeated 30 times)
Health ready check failed.
Check Riva logs with: docker logs riva-speech
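
(Note: the readiness probe can also be run by hand while the container is still up. A minimal sketch, assuming the default container name riva-speech, that curl is present in the image, and Triton's default HTTP port 8000; a 200 response from /v2/health/ready means Triton reports ready:)

# Follow the container log live while models load
docker logs -f riva-speech

# Query Triton's KServe v2 readiness endpoint from inside the container
docker exec riva-speech curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8000/v2/health/ready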
Log (docker logs riva-speech):
W0228 15:53:06.619563 116 parameter_parser.cc:146] Parameter ‘stop_history’ set but unused.
W0228 15:53:06.619565 116 parameter_parser.cc:146] Parameter ‘stop_th’ set but unused.
W0228 15:53:06.619567 116 parameter_parser.cc:146] Parameter ‘streaming’ set but unused.
W0228 15:53:06.619570 116 parameter_parser.cc:146] Parameter ‘use_subword’ set but unused.
W0228 15:53:06.619571 116 parameter_parser.cc:146] Parameter ‘vocab_file’ set but unused.
I0228 15:53:06.619705 104 backend_model.cc:303] model configuration:
{
“name”: “conformer-en-US-asr-streaming-endpointing-streaming”,
“platform”: “”,
“backend”: “riva_asr_endpointing”,
“version_policy”: {
“latest”: {
“num_versions”: 1
}
},
“max_batch_size”: 2048,
“input”: [
{
“name”: “CLASS_LOGITS”,
“data_type”: “TYPE_FP32”,
“format”: “FORMAT_NONE”,
“dims”: [
-1,
129
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
}
],
“output”: [
{
“name”: “SEGMENTS_START_END”,
“data_type”: “TYPE_INT32”,
“dims”: [
-1,
2
],
“label_filename”: “”,
“is_shape_tensor”: false
}
],
“batch_input”: ,
“batch_output”: ,
“optimization”: {
“graph”: {
“level”: 0
},
“priority”: “PRIORITY_DEFAULT”,
“cuda”: {
“graphs”: false,
“busy_wait_events”: false,
“graph_spec”: ,
“output_copy_stream”: true
},
“input_pinned_memory”: {
“enable”: true
},
“output_pinned_memory”: {
“enable”: true
},
“gather_kernel_buffer_threshold”: 0,
“eager_batching”: false
},
“sequence_batching”: {
“max_sequence_idle_microseconds”: 60000000,
“control_input”: [
{
“name”: “START”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_START”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “READY”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_READY”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
}
],
“state”:
},
“instance_group”: [
{
“name”: “conformer-en-US-asr-streaming-endpointing-streaming_0”,
“kind”: “KIND_CPU”,
“count”: 1,
“gpus”: ,
“secondary_devices”: ,
“profile”: ,
“passive”: false,
“host_policy”: “”
}
],
“default_model_filename”: “”,
“cc_model_filenames”: {},
“metric_tags”: {},
“parameters”: {
“use_subword”: {
“string_value”: “True”
},
“vocab_file”: {
“string_value”: “/data/models/conformer-en-US-asr-streaming-endpointing-streaming/1/riva_decoder_vocabulary.txt”
},
“ms_per_timestep”: {
“string_value”: “40”
},
“start_th”: {
“string_value”: “0.2”
},
“endpointing_type”: {
“string_value”: “greedy_ctc”
},
“stop_th”: {
“string_value”: “0.98”
},
“residue_blanks_at_start”: {
“string_value”: “-2”
},
“chunk_size”: {
“string_value”: “0.16”
},
“streaming”: {
“string_value”: “True”
},
“residue_blanks_at_end”: {
“string_value”: “0”
},
“stop_history”: {
“string_value”: “800”
},
“start_history”: {
“string_value”: “200”
}
},
“model_warmup”: ,
“model_transaction_policy”: {
“decoupled”: false
}
}
I0228 15:53:06.619775 104 ctc-decoder-library.cc:21] TRITONBACKEND_ModelInitialize: conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming (version 1)
W0228 15:53:06.620535 115 parameter_parser.cc:146] Parameter ‘append_space_to_transcripts’ set but unused.
W0228 15:53:06.620560 115 parameter_parser.cc:146] Parameter ‘beam_size’ set but unused.
W0228 15:53:06.620564 115 parameter_parser.cc:146] Parameter ‘beam_size_token’ set but unused.
W0228 15:53:06.620566 115 parameter_parser.cc:146] Parameter ‘beam_threshold’ set but unused.
W0228 15:53:06.620568 115 parameter_parser.cc:146] Parameter ‘blank_token’ set but unused.
W0228 15:53:06.620570 115 parameter_parser.cc:146] Parameter ‘cased’ set but unused.
W0228 15:53:06.620573 115 parameter_parser.cc:146] Parameter ‘decoder_num_worker_threads’ set but unused.
W0228 15:53:06.620575 115 parameter_parser.cc:146] Parameter ‘forerunner_beam_size’ set but unused.
W0228 15:53:06.620577 115 parameter_parser.cc:146] Parameter ‘forerunner_beam_size_token’ set but unused.
W0228 15:53:06.620579 115 parameter_parser.cc:146] Parameter ‘forerunner_beam_threshold’ set but unused.
W0228 15:53:06.620581 115 parameter_parser.cc:146] Parameter ‘forerunner_use_lm’ set but unused.
W0228 15:53:06.620584 115 parameter_parser.cc:146] Parameter ‘language_model_file’ set but unused.
W0228 15:53:06.620586 115 parameter_parser.cc:146] Parameter ‘lexicon_file’ set but unused.
W0228 15:53:06.620589 115 parameter_parser.cc:146] Parameter ‘lm_weight’ set but unused.
W0228 15:53:06.620590 115 parameter_parser.cc:146] Parameter ‘log_add’ set but unused.
W0228 15:53:06.620592 115 parameter_parser.cc:146] Parameter ‘max_execution_batch_size’ set but unused.
W0228 15:53:06.620599 115 parameter_parser.cc:146] Parameter ‘max_supported_transcripts’ set but unused.
W0228 15:53:06.620600 115 parameter_parser.cc:146] Parameter ‘num_tokenization’ set but unused.
W0228 15:53:06.620602 115 parameter_parser.cc:146] Parameter ‘profane_words_file’ set but unused.
W0228 15:53:06.620606 115 parameter_parser.cc:146] Parameter ‘return_separate_utterances’ set but unused.
W0228 15:53:06.620609 115 parameter_parser.cc:146] Parameter ‘set_default_index_to_unk_token’ set but unused.
W0228 15:53:06.620611 115 parameter_parser.cc:146] Parameter ‘sil_token’ set but unused.
W0228 15:53:06.620613 115 parameter_parser.cc:146] Parameter ‘smearing_mode’ set but unused.
W0228 15:53:06.620616 115 parameter_parser.cc:146] Parameter ‘tokenizer_model’ set but unused.
W0228 15:53:06.620618 115 parameter_parser.cc:146] Parameter ‘unk_score’ set but unused.
W0228 15:53:06.620621 115 parameter_parser.cc:146] Parameter ‘unk_token’ set but unused.
W0228 15:53:06.620623 115 parameter_parser.cc:146] Parameter ‘use_lexicon_free_decoding’ set but unused.
W0228 15:53:06.620627 115 parameter_parser.cc:146] Parameter ‘vocab_file’ set but unused.
W0228 15:53:06.620630 115 parameter_parser.cc:146] Parameter ‘word_insertion_score’ set but unused.
I0228 15:53:06.620939 104 backend_model.cc:303] model configuration:
{
“name”: “conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming”,
“platform”: “”,
“backend”: “riva_asr_decoder”,
“version_policy”: {
“latest”: {
“num_versions”: 1
}
},
“max_batch_size”: 1024,
“input”: [
{
“name”: “CLASS_LOGITS”,
“data_type”: “TYPE_FP32”,
“format”: “FORMAT_NONE”,
“dims”: [
-1,
129
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “END_FLAG”,
“data_type”: “TYPE_UINT32”,
“format”: “FORMAT_NONE”,
“dims”: [
1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “SEGMENTS_START_END”,
“data_type”: “TYPE_INT32”,
“format”: “FORMAT_NONE”,
“dims”: [
-1,
2
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “CUSTOM_CONFIGURATION”,
“data_type”: “TYPE_STRING”,
“format”: “FORMAT_NONE”,
“dims”: [
-1,
2
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
}
],
“output”: [
{
“name”: “FINAL_TRANSCRIPTS”,
“data_type”: “TYPE_STRING”,
“dims”: [
-1,
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “FINAL_TRANSCRIPTS_SCORE”,
“data_type”: “TYPE_FP32”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “FINAL_WORDS_START_END”,
“data_type”: “TYPE_INT32”,
“dims”: [
-1,
2
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “PARTIAL_TRANSCRIPTS”,
“data_type”: “TYPE_STRING”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “PARTIAL_TRANSCRIPTS_STABILITY”,
“data_type”: “TYPE_FP32”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “PARTIAL_WORDS_START_END”,
“data_type”: “TYPE_INT32”,
“dims”: [
-1,
2
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “FINAL_WORDS_SCORE”,
“data_type”: “TYPE_FP32”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “PARTIAL_WORDS_SCORE”,
“data_type”: “TYPE_FP32”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
}
],
“batch_input”: ,
“batch_output”: ,
“optimization”: {
“graph”: {
“level”: 0
},
“priority”: “PRIORITY_DEFAULT”,
“cuda”: {
“graphs”: false,
“busy_wait_events”: false,
“graph_spec”: ,
“output_copy_stream”: true
},
“input_pinned_memory”: {
“enable”: true
},
“output_pinned_memory”: {
“enable”: true
},
“gather_kernel_buffer_threshold”: 0,
“eager_batching”: false
},
“sequence_batching”: {
“oldest”: {
“max_candidate_sequences”: 1024,
“preferred_batch_size”: [
32,
64
],
“max_queue_delay_microseconds”: 1000
},
“max_sequence_idle_microseconds”: 60000000,
“control_input”: [
{
“name”: “START”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_START”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “READY”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_READY”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “END”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_END”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “CORRID”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_CORRID”,
“int32_false_true”: ,
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_UINT64”
}
]
}
],
“state”:
},
“instance_group”: [
{
“name”: “conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming_0”,
“kind”: “KIND_CPU”,
“count”: 1,
“gpus”: ,
“secondary_devices”: ,
“profile”: ,
“passive”: false,
“host_policy”: “”
}
],
“default_model_filename”: “”,
“cc_model_filenames”: {},
“metric_tags”: {},
“parameters”: {
“chunk_size”: {
“string_value”: “0.16”
},
“use_lexicon_free_decoding”: {
“string_value”: “False”
},
“smearing_mode”: {
“string_value”: “max”
},
“return_separate_utterances”: {
“string_value”: “False”
},
“append_space_to_transcripts”: {
“string_value”: “True”
},
“use_subword”: {
“string_value”: “True”
},
“cased”: {
“string_value”: “False”
},
“set_default_index_to_unk_token”: {
“string_value”: “False”
},
“vocab_file”: {
“string_value”: “/data/models/conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/riva_decoder_vocabulary.txt”
},
“sil_token”: {
“string_value”: “▁”
},
“beam_threshold”: {
“string_value”: “20.0”
},
“streaming”: {
“string_value”: “True”
},
“max_execution_batch_size”: {
“string_value”: “1024”
},
“decoder_num_worker_threads”: {
“string_value”: “-1”
},
“lm_weight”: {
“string_value”: “0.8”
},
“num_tokenization”: {
“string_value”: “1”
},
“tokenizer_model”: {
“string_value”: “/data/models/conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/15b100c6e0e74ae598e08fcd85d9f2f6_tokenizer.model”
},
“decoder_type”: {
“string_value”: “flashlight”
},
“language_model_file”: {
“string_value”: “/data/models/conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/en-US_default_6.0.bin”
},
“right_padding_size”: {
“string_value”: “1.92”
},
“word_insertion_score”: {
“string_value”: “1.0”
},
“blank_token”: {
“string_value”: “#”
},
“max_supported_transcripts”: {
“string_value”: “1”
},
“lexicon_file”: {
“string_value”: “/data/models/conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/lexicon.txt”
},
“asr_model_delay”: {
“string_value”: “-1”
},
“forerunner_use_lm”: {
“string_value”: “true”
},
“left_padding_size”: {
“string_value”: “1.92”
},
“log_add”: {
“string_value”: “True”
},
“unk_score”: {
“string_value”: “-inf”
},
“force_decoder_reset_after_ms”: {
“string_value”: “-1”
},
“profane_words_file”: {
“string_value”: “/data/models/conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/profane_words_file.txt”
},
“ms_per_timestep”: {
“string_value”: “40”
},
“unk_token”: {
“string_value”: “”
},
“beam_size”: {
“string_value”: “32”
},
“forerunner_beam_size”: {
“string_value”: “8”
},
“forerunner_beam_threshold”: {
“string_value”: “10.0”
},
“forerunner_beam_size_token”: {
“string_value”: “8”
},
“beam_size_token”: {
“string_value”: “16”
}
},
“model_warmup”: ,
“model_transaction_policy”: {
“decoupled”: false
}
}
I0228 15:53:06.622288 104 onnxruntime.cc:2459] TRITONBACKEND_Initialize: onnxruntime
I0228 15:53:06.622316 104 onnxruntime.cc:2469] Triton TRITONBACKEND API version: 1.10
I0228 15:53:06.622321 104 onnxruntime.cc:2475] ‘onnxruntime’ TRITONBACKEND API version: 1.10
I0228 15:53:06.622324 104 onnxruntime.cc:2505] backend configuration:
{“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}}
I0228 15:53:06.630989 104 feature-extractor.cc:417] TRITONBACKEND_ModelInstanceInitialize: conformer-en-US-asr-offline-feature-extractor-streaming-offline_0 (device 0)

Riva waiting for Triton server to load all models…retrying in 1 second
I0228 15:53:07.025173 104 endpointing_library.cc:24] TRITONBACKEND_ModelInstanceInitialize: conformer-en-US-asr-streaming-endpointing-streaming_0 (device 0)
I0228 15:53:07.026510 104 model_lifecycle.cc:693] successfully loaded ‘conformer-en-US-asr-offline-feature-extractor-streaming-offline’ version 1
I0228 15:53:07.051400 104 ctc-decoder-library.cc:25] TRITONBACKEND_ModelInstanceInitialize: conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming_0 (device 0)
I0228 15:53:07.052606 104 model_lifecycle.cc:693] successfully loaded ‘conformer-en-US-asr-streaming-endpointing-streaming’ version 1
I0228 15:53:07.726101 115 ctc-decoder.cc:179] Beam Decoder initialized successfully!
I0228 15:53:07.726206 104 feature-extractor.cc:415] TRITONBACKEND_ModelInitialize: conformer-en-US-asr-streaming-feature-extractor-streaming (version 1)
I0228 15:53:07.727175 104 model_lifecycle.cc:693] successfully loaded ‘conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming’ version 1
I0228 15:53:07.727412 104 backend_model.cc:303] model configuration:
{
“name”: “conformer-en-US-asr-streaming-feature-extractor-streaming”,
“platform”: “”,
“backend”: “riva_asr_features”,
“version_policy”: {
“latest”: {
“num_versions”: 1
}
},
“max_batch_size”: 1024,
“input”: [
{
“name”: “AUDIO_SIGNAL”,
“data_type”: “TYPE_FP32”,
“format”: “FORMAT_NONE”,
“dims”: [
-1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “SAMPLE_RATE”,
“data_type”: “TYPE_UINT32”,
“format”: “FORMAT_NONE”,
“dims”: [
1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
}
],
“output”: [
{
“name”: “AUDIO_FEATURES”,
“data_type”: “TYPE_FP32”,
“dims”: [
80,
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “AUDIO_PROCESSED”,
“data_type”: “TYPE_FP32”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
}
],
“batch_input”: ,
“batch_output”: ,
“optimization”: {
“graph”: {
“level”: 0
},
“priority”: “PRIORITY_DEFAULT”,
“cuda”: {
“graphs”: false,
“busy_wait_events”: false,
“graph_spec”: ,
“output_copy_stream”: true
},
“input_pinned_memory”: {
“enable”: true
},
“output_pinned_memory”: {
“enable”: true
},
“gather_kernel_buffer_threshold”: 0,
“eager_batching”: false
},
“sequence_batching”: {
“oldest”: {
“max_candidate_sequences”: 1024,
“preferred_batch_size”: [
256,
512
],
“max_queue_delay_microseconds”: 1000
},
“max_sequence_idle_microseconds”: 60000000,
“control_input”: [
{
“name”: “START”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_START”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “READY”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_READY”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “END”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_END”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “CORRID”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_CORRID”,
“int32_false_true”: ,
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_UINT64”
}
]
}
],
“state”:
},
“instance_group”: [
{
“name”: “conformer-en-US-asr-streaming-feature-extractor-streaming_0”,
“kind”: “KIND_GPU”,
“count”: 1,
“gpus”: [
0
],
“secondary_devices”: ,
“profile”: ,
“passive”: false,
“host_policy”: “”
}
],
“default_model_filename”: “”,
“cc_model_filenames”: {},
“metric_tags”: {},
“parameters”: {
“sample_rate”: {
“string_value”: “16000”
},
“left_padding_size”: {
“string_value”: “1.92”
},
“dither”: {
“string_value”: “0.0”
},
“stddev”: {
“string_value”: “2.2668, 3.1642, 3.7079, 3.7642, 3.5349, 3.5901, 3.7640, 3.8424, 4.0145, 4.1475, 4.0457, 3.9048, 3.7709, 3.6117, 3.3188, 3.1489, 3.0615, 3.0362, 2.9929, 3.0500, 3.0341, 3.0484, 3.0103, 2.9474, 2.9128, 2.8669, 2.8332, 2.9411, 3.0378, 3.0712, 3.0190, 2.9992, 3.0124, 3.0024, 3.0275, 3.0870, 3.0656, 3.0142, 3.0493, 3.1373, 3.1135, 3.0675, 2.8828, 2.7018, 2.6296, 2.8826, 2.9325, 2.9288, 2.9271, 2.9890, 3.0137, 2.9855, 3.0839, 2.9319, 2.3512, 2.3795, 2.6191, 2.7555, 2.9326, 2.9931, 3.1543, 3.0855, 2.6820, 3.0566, 3.1272, 3.1663, 3.1836, 3.0018, 2.9089, 3.1727, 3.1626, 3.1086, 2.9804, 3.1107, 3.2998, 3.3697, 3.3716, 3.2487, 3.1597, 3.1181”
},
“apply_normalization”: {
“string_value”: “True”
},
“norm_per_feature”: {
“string_value”: “True”
},
“right_padding_size”: {
“string_value”: “1.92”
},
“num_features”: {
“string_value”: “80”
},
“transpose”: {
“string_value”: “False”
},
“window_size”: {
“string_value”: “0.025”
},
“stddev_floor”: {
“string_value”: “1e-05”
},
“streaming”: {
“string_value”: “True”
},
“max_execution_batch_size”: {
“string_value”: “1024”
},
“precalc_norm_params”: {
“string_value”: “False”
},
“chunk_size”: {
“string_value”: “0.16”
},
“mean”: {
“string_value”: “-11.4412, -9.9334, -9.1292, -9.0365, -9.2804, -9.5643, -9.7342, -9.6925, -9.6333, -9.2808, -9.1887, -9.1422, -9.1397, -9.2028, -9.2749, -9.4776, -9.9185, -10.1557, -10.3800, -10.5067, -10.3190, -10.4728, -10.5529, -10.6402, -10.6440, -10.5113, -10.7395, -10.7870, -10.6074, -10.5033, -10.8278, -10.6384, -10.8481, -10.6875, -10.5454, -10.4747, -10.5165, -10.4930, -10.3413, -10.3472, -10.3735, -10.6830, -10.8813, -10.6338, -10.3856, -10.7727, -10.8957, -10.8068, -10.7373, -10.6108, -10.3405, -10.2889, -10.3922, -10.4946, -10.3367, -10.4164, -10.9949, -10.7196, -10.3971, -10.1734, -9.9257, -9.6557, -9.1761, -9.6653, -9.7876, -9.7230, -9.7792, -9.7056, -9.2702, -9.4650, -9.2755, -9.1369, -9.1174, -8.9197, -8.5394, -8.2614, -8.1353, -8.1422, -8.3430, -8.6655”
},
“window_stride”: {
“string_value”: “0.01”
},
“use_utterance_norm_params”: {
“string_value”: “False”
},
“gain”: {
“string_value”: “1.0”
},
“precalc_norm_time_steps”: {
“string_value”: “0”
}
},
“model_warmup”: ,
“model_transaction_policy”: {
“decoupled”: false
}
}
I0228 15:53:07.727511 104 onnxruntime.cc:2563] TRITONBACKEND_ModelInitialize: riva-onnx-fastpitch_encoder-English-US (version 1)
I0228 15:53:07.736406 104 feature-extractor.cc:417] TRITONBACKEND_ModelInstanceInitialize: conformer-en-US-asr-streaming-feature-extractor-streaming_0 (device 0)
Riva waiting for Triton server to load all models…retrying in 1 second
Riva waiting for Triton server to load all models…retrying in 1 second
I0228 15:53:09.190858 104 onnxruntime.cc:2606] TRITONBACKEND_ModelInstanceInitialize: riva-onnx-fastpitch_encoder-English-US_0 (GPU device 0)
I0228 15:53:09.191794 104 model_lifecycle.cc:693] successfully loaded ‘conformer-en-US-asr-streaming-feature-extractor-streaming’ version 1
2024-02-28 15:53:09.895685956 [W:onnxruntime:, session_state.cc:1030 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2024-02-28 15:53:09.895731996 [W:onnxruntime:, session_state.cc:1032 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
Riva waiting for Triton server to load all models…retrying in 1 second
Riva waiting for Triton server to load all models…retrying in 1 second
I0228 15:53:12.005835 104 model_lifecycle.cc:693] successfully loaded ‘riva-onnx-fastpitch_encoder-English-US’ version 1
I0228 15:53:12.006033 104 tensorrt.cc:5444] TRITONBACKEND_Initialize: tensorrt
I0228 15:53:12.006064 104 tensorrt.cc:5454] Triton TRITONBACKEND API version: 1.10
I0228 15:53:12.006073 104 tensorrt.cc:5460] ‘tensorrt’ TRITONBACKEND API version: 1.10
I0228 15:53:12.006077 104 tensorrt.cc:5488] backend configuration:
{“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}}
I0228 15:53:12.006241 104 pipeline_library.cc:24] TRITONBACKEND_ModelInitialize: riva-punctuation-en-US (version 1)
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0228 15:53:12.006983 119 parameter_parser.cc:146] Parameter ‘attn_mask_tensor_name’ set but unused.
W0228 15:53:12.007023 119 parameter_parser.cc:146] Parameter ‘bos_token’ set but unused.
W0228 15:53:12.007026 119 parameter_parser.cc:146] Parameter ‘capit_logits_tensor_name’ set but unused.
W0228 15:53:12.007028 119 parameter_parser.cc:146] Parameter ‘capitalization_mapping_path’ set but unused.
W0228 15:53:12.007030 119 parameter_parser.cc:146] Parameter ‘delimiter’ set but unused.
W0228 15:53:12.007031 119 parameter_parser.cc:146] Parameter ‘eos_token’ set but unused.
W0228 15:53:12.007033 119 parameter_parser.cc:146] Parameter ‘input_ids_tensor_name’ set but unused.
W0228 15:53:12.007035 119 parameter_parser.cc:146] Parameter ‘language_code’ set but unused.
W0228 15:53:12.007036 119 parameter_parser.cc:146] Parameter ‘load_model’ set but unused.
W0228 15:53:12.007038 119 parameter_parser.cc:146] Parameter ‘model_api’ set but unused.
W0228 15:53:12.007040 119 parameter_parser.cc:146] Parameter ‘model_family’ set but unused.
W0228 15:53:12.007041 119 parameter_parser.cc:146] Parameter ‘model_name’ set but unused.
W0228 15:53:12.007042 119 parameter_parser.cc:146] Parameter ‘pad_chars_with_spaces’ set but unused.
W0228 15:53:12.007045 119 parameter_parser.cc:146] Parameter ‘pipeline_type’ set but unused.
W0228 15:53:12.007045 119 parameter_parser.cc:146] Parameter ‘preserve_accents’ set but unused.
W0228 15:53:12.007047 119 parameter_parser.cc:146] Parameter ‘punct_logits_tensor_name’ set but unused.
W0228 15:53:12.007048 119 parameter_parser.cc:146] Parameter ‘punctuation_mapping_path’ set but unused.
W0228 15:53:12.007050 119 parameter_parser.cc:146] Parameter ‘remove_spaces’ set but unused.
W0228 15:53:12.007052 119 parameter_parser.cc:146] Parameter ‘to_lower’ set but unused.
W0228 15:53:12.007053 119 parameter_parser.cc:146] Parameter ‘token_type_tensor_name’ set but unused.
W0228 15:53:12.007055 119 parameter_parser.cc:146] Parameter ‘tokenizer’ set but unused.
W0228 15:53:12.007057 119 parameter_parser.cc:146] Parameter ‘tokenizer_to_lower’ set but unused.
W0228 15:53:12.007059 119 parameter_parser.cc:146] Parameter ‘unicode_normalize’ set but unused.
W0228 15:53:12.007061 119 parameter_parser.cc:146] Parameter ‘unk_token’ set but unused.
W0228 15:53:12.007062 119 parameter_parser.cc:146] Parameter ‘use_int64_nn_inputs’ set but unused.
W0228 15:53:12.007064 119 parameter_parser.cc:146] Parameter ‘vocab’ set but unused.
W0228 15:53:12.007089 119 parameter_parser.cc:146] Parameter ‘attn_mask_tensor_name’ set but unused.
W0228 15:53:12.007092 119 parameter_parser.cc:146] Parameter ‘bos_token’ set but unused.
W0228 15:53:12.007094 119 parameter_parser.cc:146] Parameter ‘capit_logits_tensor_name’ set but unused.
W0228 15:53:12.007097 119 parameter_parser.cc:146] Parameter ‘capitalization_mapping_path’ set but unused.
W0228 15:53:12.007097 119 parameter_parser.cc:146] Parameter ‘delimiter’ set but unused.
W0228 15:53:12.007099 119 parameter_parser.cc:146] Parameter ‘eos_token’ set but unused.
W0228 15:53:12.007102 119 parameter_parser.cc:146] Parameter ‘input_ids_tensor_name’ set but unused.
W0228 15:53:12.007103 119 parameter_parser.cc:146] Parameter ‘language_code’ set but unused.
W0228 15:53:12.007105 119 parameter_parser.cc:146] Parameter ‘model_api’ set but unused.
W0228 15:53:12.007107 119 parameter_parser.cc:146] Parameter ‘model_family’ set but unused.
W0228 15:53:12.007109 119 parameter_parser.cc:146] Parameter ‘model_name’ set but unused.
W0228 15:53:12.007112 119 parameter_parser.cc:146] Parameter ‘pad_chars_with_spaces’ set but unused.
W0228 15:53:12.007112 119 parameter_parser.cc:146] Parameter ‘preserve_accents’ set but unused.
W0228 15:53:12.007114 119 parameter_parser.cc:146] Parameter ‘punct_logits_tensor_name’ set but unused.
W0228 15:53:12.007117 119 parameter_parser.cc:146] Parameter ‘punctuation_mapping_path’ set but unused.
W0228 15:53:12.007118 119 parameter_parser.cc:146] Parameter ‘remove_spaces’ set but unused.
W0228 15:53:12.007122 119 parameter_parser.cc:146] Parameter ‘to_lower’ set but unused.
W0228 15:53:12.007123 119 parameter_parser.cc:146] Parameter ‘token_type_tensor_name’ set but unused.
W0228 15:53:12.007125 119 parameter_parser.cc:146] Parameter ‘tokenizer’ set but unused.
W0228 15:53:12.007126 119 parameter_parser.cc:146] Parameter ‘tokenizer_to_lower’ set but unused.
W0228 15:53:12.007128 119 parameter_parser.cc:146] Parameter ‘unicode_normalize’ set but unused.
W0228 15:53:12.007130 119 parameter_parser.cc:146] Parameter ‘unk_token’ set but unused.
W0228 15:53:12.007133 119 parameter_parser.cc:146] Parameter ‘use_int64_nn_inputs’ set but unused.
W0228 15:53:12.007134 119 parameter_parser.cc:146] Parameter ‘vocab’ set but unused.
W0228 15:53:12.007148 119 parameter_parser.cc:146] Parameter ‘attn_mask_tensor_name’ set but unused.
W0228 15:53:12.007165 119 parameter_parser.cc:146] Parameter ‘bos_token’ set but unused.
W0228 15:53:12.007169 119 parameter_parser.cc:146] Parameter ‘capit_logits_tensor_name’ set but unused.
W0228 15:53:12.007169 119 parameter_parser.cc:146] Parameter ‘capitalization_mapping_path’ set but unused.
W0228 15:53:12.007171 119 parameter_parser.cc:146] Parameter ‘delimiter’ set but unused.
W0228 15:53:12.007174 119 parameter_parser.cc:146] Parameter ‘eos_token’ set but unused.
W0228 15:53:12.007174 119 parameter_parser.cc:146] Parameter ‘input_ids_tensor_name’ set but unused.
W0228 15:53:12.007176 119 parameter_parser.cc:146] Parameter ‘language_code’ set but unused.
W0228 15:53:12.007179 119 parameter_parser.cc:146] Parameter ‘model_api’ set but unused.
W0228 15:53:12.007180 119 parameter_parser.cc:146] Parameter ‘model_family’ set but unused.
W0228 15:53:12.007182 119 parameter_parser.cc:146] Parameter ‘model_name’ set but unused.
W0228 15:53:12.007184 119 parameter_parser.cc:146] Parameter ‘pad_chars_with_spaces’ set but unused.
W0228 15:53:12.007185 119 parameter_parser.cc:146] Parameter ‘preserve_accents’ set but unused.
W0228 15:53:12.007187 119 parameter_parser.cc:146] Parameter ‘punct_logits_tensor_name’ set but unused.
W0228 15:53:12.007189 119 parameter_parser.cc:146] Parameter ‘punctuation_mapping_path’ set but unused.
W0228 15:53:12.007190 119 parameter_parser.cc:146] Parameter ‘remove_spaces’ set but unused.
W0228 15:53:12.007192 119 parameter_parser.cc:146] Parameter ‘to_lower’ set but unused.
W0228 15:53:12.007194 119 parameter_parser.cc:146] Parameter ‘token_type_tensor_name’ set but unused.
W0228 15:53:12.007196 119 parameter_parser.cc:146] Parameter ‘tokenizer_to_lower’ set but unused.
W0228 15:53:12.007198 119 parameter_parser.cc:146] Parameter ‘unicode_normalize’ set but unused.
W0228 15:53:12.007200 119 parameter_parser.cc:146] Parameter ‘unk_token’ set but unused.
W0228 15:53:12.007201 119 parameter_parser.cc:146] Parameter ‘use_int64_nn_inputs’ set but unused.
W0228 15:53:12.007203 119 parameter_parser.cc:146] Parameter ‘vocab’ set but unused.
W0228 15:53:12.007247 119 parameter_parser.cc:146] Parameter ‘model_api’ set but unused.
W0228 15:53:12.007265 119 parameter_parser.cc:146] Parameter ‘model_family’ set but unused.
I0228 15:53:12.007307 104 backend_model.cc:303] model configuration:
{
“name”: “riva-punctuation-en-US”,
“platform”: “”,
“backend”: “riva_nlp_pipeline”,
“version_policy”: {
“latest”: {
“num_versions”: 1
}
},
“max_batch_size”: 8,
“input”: [
{
“name”: “PIPELINE_INPUT”,
“data_type”: “TYPE_STRING”,
“format”: “FORMAT_NONE”,
“dims”: [
1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
}
],
“output”: [
{
“name”: “PIPELINE_OUTPUT”,
“data_type”: “TYPE_STRING”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
}
],
“batch_input”: ,
“batch_output”: ,
“optimization”: {
“priority”: “PRIORITY_DEFAULT”,
“input_pinned_memory”: {
“enable”: true
},
“output_pinned_memory”: {
“enable”: true
},
“gather_kernel_buffer_threshold”: 0,
“eager_batching”: false
},
“instance_group”: [
{
“name”: “riva-punctuation-en-US_0”,
“kind”: “KIND_CPU”,
“count”: 1,
“gpus”: ,
“secondary_devices”: ,
“profile”: ,
“passive”: false,
“host_policy”: “”
}
],
“default_model_filename”: “”,
“cc_model_filenames”: {},
“metric_tags”: {},
“parameters”: {
“tokenizer_to_lower”: {
“string_value”: “true”
},
“use_int64_nn_inputs”: {
“string_value”: “False”
},
“bos_token”: {
“string_value”: “[CLS]”
},
“punctuation_mapping_path”: {
“string_value”: “/data/models/riva-punctuation-en-US/1/bf74918539724a61a0d7703134519ea5_punct_label_ids.csv”
},
“delimiter”: {
“string_value”: " "
},
“punct_logits_tensor_name”: {
“string_value”: “punct_logits”
},
“unicode_normalize”: {
“string_value”: “False”
},
“tokenizer”: {
“string_value”: “wordpiece”
},
“input_ids_tensor_name”: {
“string_value”: “input_ids”
},
“attn_mask_tensor_name”: {
“string_value”: “attention_mask”
},
“load_model”: {
“string_value”: “false”
},
“language_code”: {
“string_value”: “en-US”
},
“model_api”: {
“string_value”: “/nvidia.riva.nlp.RivaLanguageUnderstanding/PunctuateText”
},
“token_type_tensor_name”: {
“string_value”: “token_type_ids”
},
“to_lower”: {
“string_value”: “true”
},
“unk_token”: {
“string_value”: “[UNK]”
},
“preserve_accents”: {
“string_value”: “false”
},
“capitalization_mapping_path”: {
“string_value”: “/data/models/riva-punctuation-en-US/1/56633d0a0d8e459b9c8acd572cfa34b8_capit_label_ids.csv”
},
“pipeline_type”: {
“string_value”: “punctuation”
},
“remove_spaces”: {
“string_value”: “False”
},
“pad_chars_with_spaces”: {
“string_value”: “False”
},
“eos_token”: {
“string_value”: “[SEP]”
},
“vocab”: {
“string_value”: “/data/models/riva-punctuation-en-US/1/f92889b136d2433693cb9127e1aea218_vocab.txt”
},
“model_family”: {
“string_value”: “riva”
},
“model_name”: {
“string_value”: “riva-trt-riva-punctuation-en-US-nn-bert-base-uncased”
},
“capit_logits_tensor_name”: {
“string_value”: “capit_logits”
}
},
“model_warmup”:
}
I0228 15:53:12.007364 104 tensorrt.cc:5578] TRITONBACKEND_ModelInitialize: riva-trt-conformer-en-US-asr-offline-am-streaming-offline (version 1)
I0228 15:53:12.007875 104 tensorrt.cc:5578] TRITONBACKEND_ModelInitialize: riva-trt-hifigan-English-US (version 1)
I0228 15:53:12.008345 104 backend_model.cc:188] Overriding execution policy to “TRITONBACKEND_EXECUTION_BLOCKING” for sequence model “riva-trt-hifigan-English-US”
I0228 15:53:12.010323 104 tensorrt.cc:5578] TRITONBACKEND_ModelInitialize: riva-trt-riva-punctuation-en-US-nn-bert-base-uncased (version 1)
I0228 15:53:12.010810 104 pipeline_library.cc:28] TRITONBACKEND_ModelInstanceInitialize: riva-punctuation-en-US_0 (device 0)
I0228 15:53:12.029517 104 tensorrt.cc:5627] TRITONBACKEND_ModelInstanceInitialize: riva-trt-conformer-en-US-asr-offline-am-streaming-offline_0 (GPU device 0)
I0228 15:53:12.029946 104 model_lifecycle.cc:693] successfully loaded ‘riva-punctuation-en-US’ version 1
Riva waiting for Triton server to load all models…retrying in 1 second
I0228 15:53:12.580751 104 logging.cc:49] Loaded engine size: 332 MiB
Riva waiting for Triton server to load all models…retrying in 1 second (repeated 5 times)
I0228 15:53:19.029099 104 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 1341, GPU 6138 (MiB)
I0228 15:53:19.031132 104 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +317, now: CPU 0, GPU 317 (MiB)
I0228 15:53:19.034325 104 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 676, GPU 6138 (MiB)
Riva waiting for Triton server to load all models…retrying in 1 second (repeated 9 times)
I0228 15:53:28.341926 104 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +1, GPU +352, now: CPU 1, GPU 669 (MiB)
Riva waiting for Triton server to load all models…retrying in 1 second (repeated 4 times)
I0228 15:53:32.439058 104 tensorrt.cc:1547] Created instance riva-trt-conformer-en-US-asr-offline-am-streaming-offline_0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0228 15:53:32.439205 104 tensorrt.cc:5627] TRITONBACKEND_ModelInstanceInitialize: riva-trt-hifigan-English-US_0 (GPU device 0)
I0228 15:53:32.442177 104 model_lifecycle.cc:693] successfully loaded ‘riva-trt-conformer-en-US-asr-offline-am-streaming-offline’ version 1
I0228 15:53:32.488045 104 logging.cc:49] Loaded engine size: 28 MiB
I0228 15:53:32.541526 104 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +28, now: CPU 1, GPU 697 (MiB)
I0228 15:53:32.593478 104 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +181, now: CPU 1, GPU 878 (MiB)
I0228 15:53:32.594766 104 tensorrt.cc:1547] Created instance riva-trt-hifigan-English-US_0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0228 15:53:32.594850 104 spectrogram-chunker.cc:270] TRITONBACKEND_ModelInitialize: spectrogram_chunker-English-US (version 1)
I0228 15:53:32.595218 104 model_lifecycle.cc:693] successfully loaded ‘riva-trt-hifigan-English-US’ version 1
I0228 15:53:32.595616 104 backend_model.cc:303] model configuration:
{
“name”: “spectrogram_chunker-English-US”,
“platform”: “”,
“backend”: “riva_tts_chunker”,
“version_policy”: {
“latest”: {
“num_versions”: 1
}
},
“max_batch_size”: 8,
“input”: [
{
“name”: “SPECTROGRAM”,
“data_type”: “TYPE_FP32”,
“format”: “FORMAT_NONE”,
“dims”: [
80,
-1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “IS_LAST_SENTENCE”,
“data_type”: “TYPE_INT32”,
“format”: “FORMAT_NONE”,
“dims”: [
1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “NUM_VALID_FRAMES_IN”,
“data_type”: “TYPE_INT64”,
“format”: “FORMAT_NONE”,
“dims”: [
1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “SENTENCE_NUM”,
“data_type”: “TYPE_INT32”,
“format”: “FORMAT_NONE”,
“dims”: [
1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “DURATIONS”,
“data_type”: “TYPE_FP32”,
“format”: “FORMAT_NONE”,
“dims”: [
-1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “PROCESSED_TEXT”,
“data_type”: “TYPE_STRING”,
“format”: “FORMAT_NONE”,
“dims”: [
1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “VOLUME”,
“data_type”: “TYPE_FP32”,
“format”: “FORMAT_NONE”,
“dims”: [
-1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
}
],
“output”: [
{
“name”: “SPECTROGRAM_CHUNK”,
“data_type”: “TYPE_FP32”,
“dims”: [
80,
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “END_FLAG”,
“data_type”: “TYPE_INT32”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “NUM_VALID_SAMPLES_OUT”,
“data_type”: “TYPE_INT32”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “SENTENCE_NUM”,
“data_type”: “TYPE_INT32”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “DURATIONS”,
“data_type”: “TYPE_FP32”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “PROCESSED_TEXT”,
“data_type”: “TYPE_STRING”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “VOLUME”,
“data_type”: “TYPE_FP32”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
}
],
“batch_input”: ,
“batch_output”: ,
“optimization”: {
“priority”: “PRIORITY_DEFAULT”,
“input_pinned_memory”: {
“enable”: true
},
“output_pinned_memory”: {
“enable”: true
},
“gather_kernel_buffer_threshold”: 0,
“eager_batching”: false
},
“sequence_batching”: {
“oldest”: {
“max_candidate_sequences”: 8,
“preferred_batch_size”: [
8
],
“max_queue_delay_microseconds”: 1000
},
“max_sequence_idle_microseconds”: 60000000,
“control_input”: [
{
“name”: “START”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_START”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “READY”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_READY”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “END”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_END”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “CORRID”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_CORRID”,
“int32_false_true”: ,
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_UINT64”
}
]
}
],
“state”:
},
“instance_group”: [
{
“name”: “spectrogram_chunker-English-US_0”,
“kind”: “KIND_GPU”,
“count”: 1,
“gpus”: [
0
],
“secondary_devices”: ,
“profile”: ,
“passive”: false,
“host_policy”: “”
}
],
“default_model_filename”: “”,
“cc_model_filenames”: {},
“metric_tags”: {},
“parameters”: {
“max_execution_batch_size”: {
“string_value”: “8”
},
“num_samples_per_frame”: {
“string_value”: “512”
},
“num_mels”: {
“string_value”: “80”
},
“supports_volume”: {
“string_value”: “True”
},
“chunk_length”: {
“string_value”: “80”
}
},
“model_warmup”: ,
“model_transaction_policy”: {
“decoupled”: true
}
}
I0228 15:53:32.597435 104 tensorrt.cc:5627] TRITONBACKEND_ModelInstanceInitialize: riva-trt-riva-punctuation-en-US-nn-bert-base-uncased_0 (GPU device 0)
Riva waiting for Triton server to load all models…retrying in 1 second (repeated 6 times)
I0228 15:53:38.572452 104 logging.cc:49] Loaded engine size: 368 MiB
Riva waiting for Triton server to load all models…retrying in 1 second (repeated 19 times)
I0228 15:54:00.303806 104 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +367, now: CPU 1, GPU 1245 (MiB)
Riva waiting for Triton server to load all models…retrying in 1 second (repeated 14 times)
I0228 15:54:13.988743 104 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +45, now: CPU 1, GPU 1290 (MiB)
I0228 15:54:13.989040 104 tensorrt.cc:1547] Created instance riva-trt-riva-punctuation-en-US-nn-bert-base-uncased_0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0228 15:54:13.989090 104 spectrogram-chunker.cc:272] TRITONBACKEND_ModelInstanceInitialize: spectrogram_chunker-English-US_0 (device 0)
I0228 15:54:13.989204 104 tts-postprocessor.cc:305] TRITONBACKEND_ModelInitialize: tts_postprocessor-English-US (version 1)
I0228 15:54:13.989819 104 backend_model.cc:303] model configuration:
{
“name”: “tts_postprocessor-English-US”,
“platform”: “”,
“backend”: “riva_tts_postprocessor”,
“version_policy”: {
“latest”: {
“num_versions”: 1
}
},
“max_batch_size”: 8,
“input”: [
{
“name”: “INPUT”,
“data_type”: “TYPE_FP32”,
“format”: “FORMAT_NONE”,
“dims”: [
1,
-1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “NUM_VALID_SAMPLES”,
“data_type”: “TYPE_INT32”,
“format”: “FORMAT_NONE”,
“dims”: [
1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “Prosody_volume”,
“data_type”: “TYPE_FP32”,
“format”: “FORMAT_NONE”,
“dims”: [
-1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
}
],
“output”: [
{
“name”: “OUTPUT”,
“data_type”: “TYPE_FP32”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
}
],
“batch_input”: ,
“batch_output”: ,
“optimization”: {
“priority”: “PRIORITY_DEFAULT”,
“input_pinned_memory”: {
“enable”: true
},
“output_pinned_memory”: {
“enable”: true
},
“gather_kernel_buffer_threshold”: 0,
“eager_batching”: false
},
“sequence_batching”: {
“oldest”: {
“max_candidate_sequences”: 8,
“preferred_batch_size”: [
8
],
“max_queue_delay_microseconds”: 100
},
“max_sequence_idle_microseconds”: 60000000,
“control_input”: [
{
“name”: “START”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_START”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “READY”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_READY”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “END”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_END”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “CORRID”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_CORRID”,
“int32_false_true”: ,
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_UINT64”
}
]
}
],
“state”:
},
“instance_group”: [
{
“name”: “tts_postprocessor-English-US_0”,
“kind”: “KIND_GPU”,
“count”: 1,
“gpus”: [
0
],
“secondary_devices”: ,
“profile”: ,
“passive”: false,
“host_policy”: “”
}
],
“default_model_filename”: “”,
“cc_model_filenames”: {},
“metric_tags”: {},
“parameters”: {
“fade_length”: {
“string_value”: “256”
},
“use_denoiser”: {
“string_value”: “False”
},
“chunk_num_samples”: {
“string_value”: “40960”
},
“max_execution_batch_size”: {
“string_value”: “8”
},
“hop_length”: {
“string_value”: “256”
},
“num_samples_per_frame”: {
“string_value”: “512”
},
“max_chunk_size”: {
“string_value”: “131072”
},
“filter_length”: {
“string_value”: “1024”
},
“supports_volume”: {
“string_value”: “True”
}
},
“model_warmup”: ,
“model_transaction_policy”: {
“decoupled”: false
}
}
I0228 15:54:13.990123 104 model_lifecycle.cc:693] successfully loaded ‘riva-trt-riva-punctuation-en-US-nn-bert-base-uncased’ version 1
I0228 15:54:13.990223 104 model_lifecycle.cc:693] successfully loaded ‘spectrogram_chunker-English-US’ version 1
I0228 15:54:13.997934 104 tts-postprocessor.cc:307] TRITONBACKEND_ModelInstanceInitialize: tts_postprocessor-English-US_0 (device 0)
I0228 15:54:14.016305 104 tts-preprocessor.cc:337] TRITONBACKEND_ModelInitialize: tts_preprocessor-English-US (version 1)
I0228 15:54:14.016762 104 model_lifecycle.cc:693] successfully loaded ‘tts_postprocessor-English-US’ version 1
W0228 15:54:14.017117 104 tts-preprocessor.cc:284] Parameter abbreviation_path is deprecated
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0228 15:54:14.017204 125 preprocessor.cc:231] TTS character mapping loaded from /data/models/tts_preprocessor-English-US/1/mapping.txt
I0228 15:54:14.121136 125 preprocessor.cc:269] TTS phonetic mapping loaded from /data/models/tts_preprocessor-English-US/1/ipa_cmudict-0.7b_nv22.10.txt
I0228 15:54:14.121222 125 preprocessor.cc:282] Abbreviation mapping loaded from /data/models/tts_preprocessor-English-US/1/abbr.txt
I0228 15:54:14.121279 125 normalize.cc:52] Speech Class far file missing:/data/models/tts_preprocessor-English-US/1/speech_class.far
I0228 15:54:14.205318 125 preprocessor.cc:292] TTS normalizer loaded from /data/models/tts_preprocessor-English-US/1/
I0228 15:54:14.205426 104 backend_model.cc:303] model configuration:
{
“name”: “tts_preprocessor-English-US”,
“platform”: “”,
“backend”: “riva_tts_preprocessor”,
“version_policy”: {
“latest”: {
“num_versions”: 1
}
},
“max_batch_size”: 8,
“input”: [
{
“name”: “input_string”,
“data_type”: “TYPE_STRING”,
“format”: “FORMAT_NONE”,
“dims”: [
1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
},
{
“name”: “speaker”,
“data_type”: “TYPE_INT64”,
“format”: “FORMAT_NONE”,
“dims”: [
1
],
“is_shape_tensor”: false,
“allow_ragged_batch”: false,
“optional”: false
}
],
“output”: [
{
“name”: “output”,
“data_type”: “TYPE_INT64”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “output_mask”,
“data_type”: “TYPE_FP32”,
“dims”: [
1,
400,
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “output_length”,
“data_type”: “TYPE_INT32”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “is_last_sentence”,
“data_type”: “TYPE_INT32”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “output_string”,
“data_type”: “TYPE_STRING”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “sentence_num”,
“data_type”: “TYPE_INT32”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “pitch”,
“data_type”: “TYPE_FP32”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “duration”,
“data_type”: “TYPE_FP32”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “volume”,
“data_type”: “TYPE_FP32”,
“dims”: [
-1
],
“label_filename”: “”,
“is_shape_tensor”: false
},
{
“name”: “speaker”,
“data_type”: “TYPE_INT64”,
“dims”: [
1
],
“label_filename”: “”,
“is_shape_tensor”: false
}
],
“batch_input”: ,
“batch_output”: ,
“optimization”: {
“graph”: {
“level”: 0
},
“priority”: “PRIORITY_DEFAULT”,
“cuda”: {
“graphs”: false,
“busy_wait_events”: false,
“graph_spec”: ,
“output_copy_stream”: true
},
“input_pinned_memory”: {
“enable”: true
},
“output_pinned_memory”: {
“enable”: true
},
“gather_kernel_buffer_threshold”: 0,
“eager_batching”: false
},
“sequence_batching”: {
“oldest”: {
“max_candidate_sequences”: 8,
“preferred_batch_size”: [
8
],
“max_queue_delay_microseconds”: 100
},
“max_sequence_idle_microseconds”: 60000000,
“control_input”: [
{
“name”: “START”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_START”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “READY”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_READY”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “END”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_END”,
“int32_false_true”: [
0,
1
],
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_INVALID”
}
]
},
{
“name”: “CORRID”,
“control”: [
{
“kind”: “CONTROL_SEQUENCE_CORRID”,
“int32_false_true”: ,
“fp32_false_true”: ,
“bool_false_true”: ,
“data_type”: “TYPE_UINT64”
}
]
}
],
“state”:
},
“instance_group”: [
{
“name”: “tts_preprocessor-English-US_0”,
“kind”: “KIND_GPU”,
“count”: 1,
“gpus”: [
0
],
“secondary_devices”: ,
“profile”: ,
“passive”: false,
“host_policy”: “”
}
],
“default_model_filename”: “”,
“cc_model_filenames”: {},
“metric_tags”: {},
“parameters”: {
“supports_speaker_mixing”: {
“string_value”: “False”
},
“norm_proto_path”: {
“string_value”: “/data/models/tts_preprocessor-English-US/1/”
},
“enable_emphasis_tag”: {
“string_value”: “True”
},
“abbreviations_path”: {
“string_value”: “/data/models/tts_preprocessor-English-US/1/abbr.txt”
},
“pad_with_space”: {
“string_value”: “True”
},
“supports_ragged_batches”: {
“string_value”: “True”
},
“start_of_emphasis_token”: {
“string_value”: “[”
},
“upper_case_chars”: {
“string_value”: “True”
},
“pitch_std”: {
“string_value”: “68.77673200611284”
},
“g2p_ignore_ambiguous”: {
“string_value”: “True”
},
“phone_set”: {
“string_value”: “ipa”
},
“mapping_path”: {
“string_value”: “/data/models/tts_preprocessor-English-US/1/mapping.txt”
},
“subvoices”: {
“string_value”: “Female-1:0,Male-1:1,Female-Neutral:2,Male-Neutral:3,Female-Angry:4,Male-Angry:5,Female-Calm:6,Male-Calm:7,Female-Fearful:10,Female-Happy:12,Male-Happy:13,Female-Sad:14”
},
“dictionary_path”: {
“string_value”: “/data/models/tts_preprocessor-English-US/1/ipa_cmudict-0.7b_nv22.10.txt”
},
“max_input_length”: {
“string_value”: “2000”
},
“max_sequence_length”: {
“string_value”: “400”
},
“language”: {
“string_value”: “en-US”
},
“end_of_emphasis_token”: {
“string_value”: “]”
},
“upper_case_g2p”: {
“string_value”: “True”
},
“normalize_pitch”: {
“string_value”: “True”
}
},
“model_warmup”: ,
“model_transaction_policy”: {
“decoupled”: true
}
}
I0228 15:54:14.205580 104 tts-preprocessor.cc:339] TRITONBACKEND_ModelInstanceInitialize: tts_preprocessor-English-US_0 (device 0)
I0228 15:54:14.206057 104 model_lifecycle.cc:693] successfully loaded ‘tts_preprocessor-English-US’ version 1
I0228 15:54:14.206685 104 model_lifecycle.cc:459] loading: conformer-en-US-asr-offline:1
I0228 15:54:14.206760 104 model_lifecycle.cc:459] loading: fastpitch_hifigan_ensemble-English-US:1
I0228 15:54:14.206929 104 model_lifecycle.cc:693] successfully loaded ‘conformer-en-US-asr-offline’ version 1
I0228 15:54:14.206988 104 model_lifecycle.cc:693] successfully loaded ‘fastpitch_hifigan_ensemble-English-US’ version 1
I0228 15:54:14.207084 104 server.cc:563]
±-----------------±-----+
| Repository Agent | Path |
±-----------------±-----+
±-----------------±-----+

I0228 15:54:14.207203 104 server.cc:590]
±-----------------------±--------------------------------------------------------------------------------------±---------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend | Path | Config |
±-----------------------±--------------------------------------------------------------------------------------±---------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tensorrt | /opt/tritonserver/backends/tensorrt/libtriton_tensorrt.so | {“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}} |
| riva_asr_endpointing | /opt/tritonserver/backends/riva_asr_endpointing/libtriton_riva_asr_endpointing.so | {“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}} |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}} |
| riva_asr_features | /opt/tritonserver/backends/riva_asr_features/libtriton_riva_asr_features.so | {“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}} |
| riva_tts_preprocessor | /opt/tritonserver/backends/riva_tts_preprocessor/libtriton_riva_tts_preprocessor.so | {“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}} |
| riva_nlp_pipeline | /opt/tritonserver/backends/riva_nlp_pipeline/libtriton_riva_nlp_pipeline.so | {“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}} |
| riva_tts_chunker | /opt/tritonserver/backends/riva_tts_chunker/libtriton_riva_tts_chunker.so | {“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}} |
| riva_asr_decoder | /opt/tritonserver/backends/riva_asr_decoder/libtriton_riva_asr_decoder.so | {“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}} |
| riva_tts_postprocessor | /opt/tritonserver/backends/riva_tts_postprocessor/libtriton_riva_tts_postprocessor.so | {“cmdline”:{“auto-complete-config”:“false”,“min-compute-capability”:“6.000000”,“backend-directory”:“/opt/tritonserver/backends”,“default-max-batch-size”:“4”}} |
+------------------------+---------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0228 15:54:14.207360 104 server.cc:633]
+-----------------------------------------------------------------+---------+--------+
| Model | Version | Status |
+-----------------------------------------------------------------+---------+--------+
| conformer-en-US-asr-offline | 1 | READY |
| conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline | 1 | READY |
| conformer-en-US-asr-offline-endpointing-streaming-offline | 1 | READY |
| conformer-en-US-asr-offline-feature-extractor-streaming-offline | 1 | READY |
| conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming | 1 | READY |
| conformer-en-US-asr-streaming-endpointing-streaming | 1 | READY |
| conformer-en-US-asr-streaming-feature-extractor-streaming | 1 | READY |
| fastpitch_hifigan_ensemble-English-US | 1 | READY |
| riva-onnx-fastpitch_encoder-English-US | 1 | READY |
| riva-punctuation-en-US | 1 | READY |
| riva-trt-conformer-en-US-asr-offline-am-streaming-offline | 1 | READY |
| riva-trt-hifigan-English-US | 1 | READY |
| riva-trt-riva-punctuation-en-US-nn-bert-base-uncased | 1 | READY |
| spectrogram_chunker-English-US | 1 | READY |
| tts_postprocessor-English-US | 1 | READY |
| tts_preprocessor-English-US | 1 | READY |
+-----------------------------------------------------------------+---------+--------+

I0228 15:54:14.240540 104 metrics.cc:864] Collecting metrics for GPU 0: NVIDIA GeForce RTX 4060
I0228 15:54:14.240713 104 metrics.cc:757] Collecting CPU metrics
I0228 15:54:14.241451 104 tritonserver.cc:2264]
+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.27.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace logging |
| model_repository_path[0] | /data/models |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 1000000000 |
| response_cache_byte_size | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0228 15:54:14.241473 104 server.cc:264] Waiting for in-flight requests to complete.
I0228 15:54:14.241491 104 server.cc:280] Timeout 30: Found 0 model versions that have in-flight inferences
I0228 15:54:14.242447 104 model_lifecycle.cc:578] successfully unloaded ‘conformer-en-US-asr-offline’ version 1
I0228 15:54:14.242991 104 tts-postprocessor.cc:310] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0228 15:54:14.243072 104 tensorrt.cc:5665] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0228 15:54:14.243285 104 ctc-decoder-library.cc:27] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0228 15:54:14.243533 104 endpointing_library.cc:28] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0228 15:54:14.244060 104 feature-extractor.cc:420] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0228 15:54:14.244791 104 pipeline_library.cc:31] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0228 15:54:14.245015 104 spectrogram-chunker.cc:275] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0228 15:54:14.245037 104 tensorrt.cc:5665] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0228 15:54:14.245085 104 spectrogram-chunker.cc:271] TRITONBACKEND_ModelFinalize: delete model state
I0228 15:54:14.245221 104 ctc-decoder-library.cc:27] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0228 15:54:14.245398 104 endpointing_library.cc:28] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0228 15:54:14.245428 104 model_lifecycle.cc:578] successfully unloaded ‘spectrogram_chunker-English-US’ version 1
I0228 15:54:14.245728 104 tts-preprocessor.cc:342] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0228 15:54:14.245780 104 tts-preprocessor.cc:338] TRITONBACKEND_ModelFinalize: delete model state
I0228 15:54:14.245790 104 feature-extractor.cc:420] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0228 15:54:14.246340 104 server.cc:295] All models are stopped, unloading models
I0228 15:54:14.246370 104 onnxruntime.cc:2640] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0228 15:54:14.246424 104 tensorrt.cc:5665] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0228 15:54:14.246472 104 server.cc:302] Timeout 30: Found 14 live models and 0 in-flight non-inference requests
I0228 15:54:14.250574 104 endpointing_library.cc:23] TRITONBACKEND_ModelFinalize: delete model state
I0228 15:54:14.251210 104 model_lifecycle.cc:578] successfully unloaded ‘conformer-en-US-asr-offline-endpointing-streaming-offline’ version 1
I0228 15:54:14.252739 104 endpointing_library.cc:23] TRITONBACKEND_ModelFinalize: delete model state
I0228 15:54:14.253209 104 model_lifecycle.cc:578] successfully unloaded ‘conformer-en-US-asr-streaming-endpointing-streaming’ version 1
I0228 15:54:14.255236 104 model_lifecycle.cc:578] successfully unloaded ‘fastpitch_hifigan_ensemble-English-US’ version 1
I0228 15:54:14.263458 104 pipeline_library.cc:27] TRITONBACKEND_ModelFinalize: delete model state
I0228 15:54:14.265170 104 model_lifecycle.cc:578] successfully unloaded ‘riva-punctuation-en-US’ version 1
I0228 15:54:14.283743 104 model_lifecycle.cc:578] successfully unloaded ‘tts_preprocessor-English-US’ version 1
I0228 15:54:14.301737 104 tts-postprocessor.cc:306] TRITONBACKEND_ModelFinalize: delete model state
I0228 15:54:14.301980 104 model_lifecycle.cc:578] successfully unloaded ‘tts_postprocessor-English-US’ version 1
I0228 15:54:14.343146 104 onnxruntime.cc:2586] TRITONBACKEND_ModelFinalize: delete model state
I0228 15:54:14.343487 104 model_lifecycle.cc:578] successfully unloaded ‘riva-onnx-fastpitch_encoder-English-US’ version 1
I0228 15:54:14.365964 104 feature-extractor.cc:416] TRITONBACKEND_ModelFinalize: delete model state
I0228 15:54:14.469173 104 ctc-decoder-library.cc:24] TRITONBACKEND_ModelFinalize: delete model state
I0228 15:54:14.469819 104 model_lifecycle.cc:578] successfully unloaded ‘conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline’ version 1
I0228 15:54:14.479901 104 ctc-decoder-library.cc:24] TRITONBACKEND_ModelFinalize: delete model state
I0228 15:54:14.479898 104 model_lifecycle.cc:578] successfully unloaded ‘conformer-en-US-asr-streaming-feature-extractor-streaming’ version 1
I0228 15:54:14.480373 104 model_lifecycle.cc:578] successfully unloaded ‘conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming’ version 1

Riva waiting for Triton server to load all models…retrying in 1 second
I0228 15:54:14.791865 104 tensorrt.cc:5604] TRITONBACKEND_ModelFinalize: delete model state
I0228 15:54:14.804046 104 tensorrt.cc:5604] TRITONBACKEND_ModelFinalize: delete model state
I0228 15:54:14.804100 104 model_lifecycle.cc:578] successfully unloaded ‘riva-trt-hifigan-English-US’ version 1
I0228 15:54:14.804203 104 model_lifecycle.cc:578] successfully unloaded ‘riva-trt-riva-punctuation-en-US-nn-bert-base-uncased’ version 1
I0228 15:54:14.807844 104 feature-extractor.cc:416] TRITONBACKEND_ModelFinalize: delete model state
I0228 15:54:14.808709 104 model_lifecycle.cc:578] successfully unloaded ‘conformer-en-US-asr-offline-feature-extractor-streaming-offline’ version 1
I0228 15:54:14.890997 104 tensorrt.cc:5604] TRITONBACKEND_ModelFinalize: delete model state
I0228 15:54:14.891246 104 model_lifecycle.cc:578] successfully unloaded ‘riva-trt-conformer-en-US-asr-offline-am-streaming-offline’ version 1
W0228 15:54:15.242546 104 metrics.cc:621] Unable to get power usage for GPU 0. Status:Success, value:0.000000
W0228 15:54:15.242593 104 metrics.cc:645] Unable to get energy consumption for GPU 0. Status:Success, value:0
I0228 15:54:15.246572 104 server.cc:302] Timeout 29: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
Riva waiting for Triton server to load all models…retrying in 1 second
W
W
Riva waiting for Triton server to load all models…retrying in 1 second
Triton server died before reaching ready state. Terminating Riva startup.
Check Triton logs with: docker logs
/opt/riva/bin/start-riva: line 1: kill: (104) - No such process

Please help me figure out why Triton fails to load all models and the Riva health check never passes!
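
In case it helps whoever picks this up: the excerpt above only captures the shutdown sequence that follows "error: creating server: Internal - failed to load all models". The specific model or TensorRT engine that failed should be reported earlier in the container log as an E-prefixed error line. A minimal sketch for pulling out just those lines, assuming the container still runs under the default quickstart name riva-speech:

# Hedged sketch: filter the full container log for error-level lines and load
# failures. The container name "riva-speech" is an assumption based on the
# default Riva quickstart setup.
docker logs riva-speech 2>&1 \
  | grep -E '^E[0-9]|failed to load' \
  | head -n 50

The first matching line usually names the model that could not be loaded, which is the detail missing from the excerpt above.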