==========================
=== Riva Speech Skills ===
==========================

NVIDIA Release 23.01 (build 52756634)

Copyright (c) 2018-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

> Riva waiting for Triton server to load all models...retrying in 1 second
Warning: '--strict-model-config' has been deprecated! Please use '--disable-auto-complete-config' instead.
I0301 17:21:15.578101 111 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f0a8e000000' with size 268435456
I0301 17:21:15.579709 111 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 1000000000
I0301 17:21:15.588092 111 model_lifecycle.cc:459] loading: en_de_24x6:1
I0301 17:21:15.588124 111 model_lifecycle.cc:459] loading: en_de_24x6-classifier:1
I0301 17:21:15.588147 111 model_lifecycle.cc:459] loading: en_de_24x6-decoder:1
I0301 17:21:15.588167 111 model_lifecycle.cc:459] loading: en_de_24x6-encoder:1
I0301 17:21:15.588189 111 model_lifecycle.cc:459] loading: en_es_24x6:1
I0301 17:21:15.588207 111 model_lifecycle.cc:459] loading: en_es_24x6-classifier:1
I0301 17:21:15.588228 111 model_lifecycle.cc:459] loading: en_es_24x6-decoder:1
I0301 17:21:15.588247 111 model_lifecycle.cc:459] loading: en_es_24x6-encoder:1
I0301 17:21:15.588272 111 model_lifecycle.cc:459] loading: en_fr_24x6:1
I0301 17:21:15.588291 111 model_lifecycle.cc:459] loading: en_fr_24x6-classifier:1
I0301 17:21:15.588309 111 model_lifecycle.cc:459] loading: en_fr_24x6-decoder:1
I0301 17:21:15.588333 111 model_lifecycle.cc:459] loading: en_fr_24x6-encoder:1
I0301 17:21:15.588358 111 model_lifecycle.cc:459] loading: en_ru_24x6:1
I0301 17:21:15.588375 111 model_lifecycle.cc:459] loading: en_ru_24x6-classifier:1
I0301 17:21:15.588395 111 model_lifecycle.cc:459] loading: en_ru_24x6-decoder:1
I0301 17:21:15.588413 111 model_lifecycle.cc:459] loading: en_ru_24x6-encoder:1
I0301 17:21:15.588434 111 model_lifecycle.cc:459] loading: en_zh_24x6:1
I0301 17:21:15.588452 111 model_lifecycle.cc:459] loading: en_zh_24x6-classifier:1
I0301 17:21:15.588475 111 model_lifecycle.cc:459] loading: en_zh_24x6-decoder:1
I0301 17:21:15.588494 111 model_lifecycle.cc:459] loading: en_zh_24x6-encoder:1
I0301 17:21:15.588515 111 model_lifecycle.cc:459] loading: riva-onnx-fastpitch_encoder-English-US:1
I0301 17:21:15.588533 111 model_lifecycle.cc:459] loading: riva-trt-hifigan-English-US:1
I0301 17:21:15.588552 111 model_lifecycle.cc:459] loading: spectrogram_chunker-English-US:1
I0301 17:21:15.588574 111 model_lifecycle.cc:459] loading: tts_postprocessor-English-US:1
I0301 17:21:15.588597 111 model_lifecycle.cc:459] loading: tts_preprocessor-English-US:1
I0301 17:21:15.590745 111 onnxruntime.cc:2459] TRITONBACKEND_Initialize: onnxruntime
I0301 17:21:15.590757 111 onnxruntime.cc:2469] Triton TRITONBACKEND API version: 1.10
I0301 17:21:15.590764 111 onnxruntime.cc:2475] 'onnxruntime' TRITONBACKEND API version: 1.10
I0301 17:21:15.590768 111 onnxruntime.cc:2505] backend configuration: {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}}
I0301 17:21:15.631562 111 tensorrt.cc:5444] TRITONBACKEND_Initialize: tensorrt
I0301 17:21:15.631579 111 tensorrt.cc:5454] Triton TRITONBACKEND API version: 1.10
I0301 17:21:15.631584 111 tensorrt.cc:5460] 'tensorrt' TRITONBACKEND API version: 1.10
I0301 17:21:15.631587 111 tensorrt.cc:5488] backend configuration: {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}}
I0301 17:21:15.631740 111 onnxruntime.cc:2563] TRITONBACKEND_ModelInitialize: en_de_24x6-classifier (version 1)
I0301 17:21:15.632153 111 onnxruntime.cc:2563] TRITONBACKEND_ModelInitialize: en_de_24x6-decoder (version 1)
I0301 17:21:15.632595 111 onnxruntime.cc:2563] TRITONBACKEND_ModelInitialize: en_de_24x6-encoder (version 1)
I0301 17:21:15.633418 111 onnxruntime.cc:2563] TRITONBACKEND_ModelInitialize: en_es_24x6-classifier (version 1)
I0301 17:21:15.633784 111 onnxruntime.cc:2563] TRITONBACKEND_ModelInitialize: en_es_24x6-decoder (version 1)
I0301 17:21:15.634122 111 onnxruntime.cc:2563] TRITONBACKEND_ModelInitialize: en_es_24x6-encoder (version 1)
I0301 17:21:15.634760 111 onnxruntime.cc:2563] TRITONBACKEND_ModelInitialize: en_fr_24x6-classifier (version 1)
I0301 17:21:15.635083 111 onnxruntime.cc:2563] TRITONBACKEND_ModelInitialize: en_fr_24x6-decoder (version 1)
I0301 17:21:15.635394 111 onnxruntime.cc:2563] TRITONBACKEND_ModelInitialize: en_fr_24x6-encoder (version 1)
I0301 17:21:15.636106 111 onnxruntime.cc:2563] TRITONBACKEND_ModelInitialize: en_ru_24x6-classifier (version 1)
I0301 17:21:15.636385 111 onnxruntime.cc:2563] TRITONBACKEND_ModelInitialize: en_ru_24x6-decoder (version 1)
I0301 17:21:15.636708 111 onnxruntime.cc:2563] TRITONBACKEND_ModelInitialize: en_ru_24x6-encoder (version 1)
I0301 17:21:15.637383 111 onnxruntime.cc:2563] TRITONBACKEND_ModelInitialize: en_zh_24x6-classifier (version 1)
I0301 17:21:15.637693 111 onnxruntime.cc:2563] TRITONBACKEND_ModelInitialize: en_zh_24x6-decoder (version 1)
I0301 17:21:15.638002 111 onnxruntime.cc:2563] TRITONBACKEND_ModelInitialize: en_zh_24x6-encoder (version 1)
I0301 17:21:15.638333 111 onnxruntime.cc:2563] TRITONBACKEND_ModelInitialize: riva-onnx-fastpitch_encoder-English-US (version 1)
I0301 17:21:15.638702 111 python_be.cc:1856] TRITONBACKEND_ModelInstanceInitialize: en_de_24x6 (GPU device 0)
I0301 17:21:16.247186 111 model_lifecycle.cc:693] successfully loaded 'en_de_24x6' version 1
I0301 17:21:16.247703 111 onnxruntime.cc:2606] TRITONBACKEND_ModelInstanceInitialize: en_de_24x6-classifier_0 (GPU device 0)
> Riva waiting for Triton server to load all models...retrying in 1 second
> Riva waiting for Triton server to load all models...retrying in 1 second
I0301 17:21:17.378042 111 onnxruntime.cc:2606] TRITONBACKEND_ModelInstanceInitialize: en_de_24x6-decoder_0 (GPU device 0)
I0301 17:21:17.378386 111 model_lifecycle.cc:693] successfully loaded 'en_de_24x6-classifier' version 1
2023-03-01 17:21:17.764383954 [W:onnxruntime:, session_state.cc:1030 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2023-03-01 17:21:17.764429249 [W:onnxruntime:, session_state.cc:1032 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
I0301 17:21:17.930116 111 onnxruntime.cc:2606] TRITONBACKEND_ModelInstanceInitialize: en_de_24x6-encoder_0 (GPU device 0)
I0301 17:21:17.930499 111 model_lifecycle.cc:693] successfully loaded 'en_de_24x6-decoder' version 1
> Riva waiting for Triton server to load all models...retrying in 1 second
2023-03-01 17:21:19.128979388 [W:onnxruntime:, session_state.cc:1030 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2023-03-01 17:21:19.129001820 [W:onnxruntime:, session_state.cc:1032 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
> Riva waiting for Triton server to load all models...retrying in 1 second
I0301 17:21:19.490319 111 tensorrt.cc:5578] TRITONBACKEND_ModelInitialize: riva-trt-hifigan-English-US (version 1)
I0301 17:21:19.490550 111 model_lifecycle.cc:693] successfully loaded 'en_de_24x6-encoder' version 1
I0301 17:21:19.491136 111 python_be.cc:1856] TRITONBACKEND_ModelInstanceInitialize: en_es_24x6 (GPU device 0)
I0301 17:21:19.491139 111 backend_model.cc:188] Overriding execution policy to "TRITONBACKEND_EXECUTION_BLOCKING" for sequence model "riva-trt-hifigan-English-US"
I0301 17:21:20.111783 111 onnxruntime.cc:2606] TRITONBACKEND_ModelInstanceInitialize: en_es_24x6-classifier_0 (GPU device 0)
I0301 17:21:20.112066 111 model_lifecycle.cc:693] successfully loaded 'en_es_24x6' version 1
> Riva waiting for Triton server to load all models...retrying in 1 second
I0301 17:21:20.406419 111 onnxruntime.cc:2606] TRITONBACKEND_ModelInstanceInitialize: en_es_24x6-decoder_0 (GPU device 0)
I0301 17:21:20.406693 111 model_lifecycle.cc:693] successfully loaded 'en_es_24x6-classifier' version 1
2023-03-01 17:21:20.907746717 [W:onnxruntime:, session_state.cc:1030 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2023-03-01 17:21:20.907774930 [W:onnxruntime:, session_state.cc:1032 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
I0301 17:21:21.104800 111 onnxruntime.cc:2606] TRITONBACKEND_ModelInstanceInitialize: en_es_24x6-encoder_0 (GPU device 0)
I0301 17:21:21.105218 111 model_lifecycle.cc:693] successfully loaded 'en_es_24x6-decoder' version 1
> Riva waiting for Triton server to load all models...retrying in 1 second
2023-03-01 17:21:22.282497204 [W:onnxruntime:, session_state.cc:1030 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2023-03-01 17:21:22.282522201 [W:onnxruntime:, session_state.cc:1032 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
> Riva waiting for Triton server to load all models...retrying in 1 second
I0301 17:21:22.659847 111 python_be.cc:1856] TRITONBACKEND_ModelInstanceInitialize: en_fr_24x6 (GPU device 0)
I0301 17:21:22.660159 111 model_lifecycle.cc:693] successfully loaded 'en_es_24x6-encoder' version 1
> Riva waiting for Triton server to load all models...retrying in 1 second
I0301 17:21:23.382485 111 onnxruntime.cc:2606] TRITONBACKEND_ModelInstanceInitialize: en_fr_24x6-classifier_0 (GPU device 0)
I0301 17:21:23.382781 111 model_lifecycle.cc:693] successfully loaded 'en_fr_24x6' version 1
I0301 17:21:23.656739 111 onnxruntime.cc:2606] TRITONBACKEND_ModelInstanceInitialize: en_fr_24x6-decoder_0 (GPU device 0)
I0301 17:21:23.657056 111 model_lifecycle.cc:693] successfully loaded 'en_fr_24x6-classifier' version 1
2023-03-01 17:21:24.172657722 [W:onnxruntime:, session_state.cc:1030 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2023-03-01 17:21:24.172686486 [W:onnxruntime:, session_state.cc:1032 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
> Riva waiting for Triton server to load all models...retrying in 1 second
I0301 17:21:24.368100 111 onnxruntime.cc:2606] TRITONBACKEND_ModelInstanceInitialize: en_fr_24x6-encoder_0 (GPU device 0)
I0301 17:21:24.368431 111 model_lifecycle.cc:693] successfully loaded 'en_fr_24x6-decoder' version 1
> Riva waiting for Triton server to load all models...retrying in 1 second
2023-03-01 17:21:25.576950852 [W:onnxruntime:, session_state.cc:1030 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2023-03-01 17:21:25.576970298 [W:onnxruntime:, session_state.cc:1032 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
I0301 17:21:25.958969 111 python_be.cc:1856] TRITONBACKEND_ModelInstanceInitialize: en_ru_24x6 (GPU device 0)
I0301 17:21:25.959297 111 model_lifecycle.cc:693] successfully loaded 'en_fr_24x6-encoder' version 1
> Riva waiting for Triton server to load all models...retrying in 1 second
I0301 17:21:26.653336 111 onnxruntime.cc:2606] TRITONBACKEND_ModelInstanceInitialize: en_ru_24x6-classifier_0 (GPU device 0)
I0301 17:21:26.653646 111 model_lifecycle.cc:693] successfully loaded 'en_ru_24x6' version 1
I0301 17:21:26.934547 111 onnxruntime.cc:2606] TRITONBACKEND_ModelInstanceInitialize: en_ru_24x6-decoder_0 (GPU device 0)
I0301 17:21:26.934861 111 model_lifecycle.cc:693] successfully loaded 'en_ru_24x6-classifier' version 1
> Riva waiting for Triton server to load all models...retrying in 1 second
2023-03-01 17:21:27.449994191 [W:onnxruntime:, session_state.cc:1030 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2023-03-01 17:21:27.450017806 [W:onnxruntime:, session_state.cc:1032 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
I0301 17:21:27.646437 111 onnxruntime.cc:2606] TRITONBACKEND_ModelInstanceInitialize: en_ru_24x6-encoder_0 (GPU device 0)
I0301 17:21:27.646760 111 model_lifecycle.cc:693] successfully loaded 'en_ru_24x6-decoder' version 1
> Riva waiting for Triton server to load all models...retrying in 1 second
2023-03-01 17:21:28.884439580 [W:onnxruntime:, session_state.cc:1030 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2023-03-01 17:21:28.884462994 [W:onnxruntime:, session_state.cc:1032 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
I0301 17:21:29.268641 111 python_be.cc:1856] TRITONBACKEND_ModelInstanceInitialize: en_zh_24x6 (GPU device 0)
I0301 17:21:29.269057 111 model_lifecycle.cc:693] successfully loaded 'en_ru_24x6-encoder' version 1
> Riva waiting for Triton server to load all models...retrying in 1 second
I0301 17:21:30.038862 111 onnxruntime.cc:2606] TRITONBACKEND_ModelInstanceInitialize: en_zh_24x6-classifier_0 (GPU device 0)
I0301 17:21:30.039645 111 model_lifecycle.cc:693] successfully loaded 'en_zh_24x6' version 1
I0301 17:21:30.316710 111 onnxruntime.cc:2606] TRITONBACKEND_ModelInstanceInitialize: en_zh_24x6-decoder_0 (GPU device 0)
I0301 17:21:30.317024 111 model_lifecycle.cc:693] successfully loaded 'en_zh_24x6-classifier' version 1
> Riva waiting for Triton server to load all models...retrying in 1 second
2023-03-01 17:21:30.827115054 [W:onnxruntime:, session_state.cc:1030 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2023-03-01 17:21:30.827144139 [W:onnxruntime:, session_state.cc:1032 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
I0301 17:21:31.031636 111 onnxruntime.cc:2606] TRITONBACKEND_ModelInstanceInitialize: en_zh_24x6-encoder_0 (GPU device 0)
I0301 17:21:31.031955 111 model_lifecycle.cc:693] successfully loaded 'en_zh_24x6-decoder' version 1
> Riva waiting for Triton server to load all models...retrying in 1 second
2023-03-01 17:21:32.239722082 [W:onnxruntime:, session_state.cc:1030 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2023-03-01 17:21:32.239742260 [W:onnxruntime:, session_state.cc:1032 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
> Riva waiting for Triton server to load all models...retrying in 1 second
I0301 17:21:32.619906 111 onnxruntime.cc:2606] TRITONBACKEND_ModelInstanceInitialize: riva-onnx-fastpitch_encoder-English-US_0 (GPU device 0)
I0301 17:21:32.620230 111 model_lifecycle.cc:693] successfully loaded 'en_zh_24x6-encoder' version 1
2023-03-01 17:21:32.991348388 [W:onnxruntime:, session_state.cc:1030 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2023-03-01 17:21:32.991369287 [W:onnxruntime:, session_state.cc:1032 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
I0301 17:21:33.044015 111 model_lifecycle.cc:693] successfully loaded 'riva-onnx-fastpitch_encoder-English-US' version 1
I0301 17:21:33.044768 111 spectrogram-chunker.cc:274] TRITONBACKEND_ModelInitialize: spectrogram_chunker-English-US (version 1)
I0301 17:21:33.045381 111 backend_model.cc:303] model configuration: { "name": "spectrogram_chunker-English-US", "platform": "", "backend": "riva_tts_chunker", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 8, "input": [ { "name": "SPECTROGRAM", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ 80, -1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "IS_LAST_SENTENCE", "data_type": "TYPE_INT32", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "NUM_VALID_FRAMES_IN", "data_type": "TYPE_INT64", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "SENTENCE_NUM", "data_type": "TYPE_INT32", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "DURATIONS", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "PROCESSED_TEXT", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "VOLUME", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "SPECTROGRAM_CHUNK", "data_type": "TYPE_FP32", "dims": [ 80, -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "END_FLAG", "data_type": "TYPE_INT32", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "NUM_VALID_SAMPLES_OUT", "data_type": "TYPE_INT32", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "SENTENCE_NUM", "data_type": "TYPE_INT32", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "DURATIONS", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PROCESSED_TEXT", "data_type": "TYPE_STRING", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "VOLUME", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "priority": "PRIORITY_DEFAULT", "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "sequence_batching": { "oldest": { "max_candidate_sequences": 8, "preferred_batch_size": [ 8 ], "max_queue_delay_microseconds": 1000 }, "max_sequence_idle_microseconds": 60000000, "control_input": [ { "name": "START", "control": [ { "kind": "CONTROL_SEQUENCE_START", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "READY", "control": [ { "kind": "CONTROL_SEQUENCE_READY", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "END", "control": [ { "kind": "CONTROL_SEQUENCE_END", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "CORRID", "control": [ { "kind": "CONTROL_SEQUENCE_CORRID", "int32_false_true": [], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_UINT64" } ] } ], "state": [] }, "instance_group": [ { "name": "spectrogram_chunker-English-US_0", "kind": "KIND_GPU", "count": 1, "gpus": [ 0 ], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "max_execution_batch_size": { "string_value": "8" }, "num_mels": { "string_value": "80" }, "chunk_length": { "string_value": "80" }, "supports_volume": { "string_value": "True" }, "num_samples_per_frame": { "string_value": "512" } }, "model_warmup": [], "model_transaction_policy": { "decoupled": true } }
I0301 17:21:33.045423 111 tensorrt.cc:5627] TRITONBACKEND_ModelInstanceInitialize: riva-trt-hifigan-English-US_0 (GPU device 0)
> Riva waiting for Triton server to load all models...retrying in 1 second
I0301 17:21:33.473303 111 logging.cc:49] Loaded engine size: 28 MiB
I0301 17:21:33.579805 111 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +28, now: CPU 0, GPU 28 (MiB)
I0301 17:21:33.586155 111 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +182, now: CPU 0, GPU 210 (MiB)
W0301 17:21:33.586171 111 logging.cc:46] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
I0301 17:21:33.586455 111 tensorrt.cc:1547] Created instance riva-trt-hifigan-English-US_0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0301 17:21:33.587165 111 model_lifecycle.cc:693] successfully loaded 'riva-trt-hifigan-English-US' version 1
I0301 17:21:33.589143 111 spectrogram-chunker.cc:276] TRITONBACKEND_ModelInstanceInitialize: spectrogram_chunker-English-US_0 (device 0)
I0301 17:21:33.589195 111 tts-postprocessor.cc:300] TRITONBACKEND_ModelInitialize: tts_postprocessor-English-US (version 1)
I0301 17:21:33.589447 111 model_lifecycle.cc:693] successfully loaded 'spectrogram_chunker-English-US' version 1
I0301 17:21:33.589912 111 backend_model.cc:303] model configuration: { "name": "tts_postprocessor-English-US", "platform": "", "backend": "riva_tts_postprocessor", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 8, "input": [ { "name": "INPUT", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ 1, -1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "NUM_VALID_SAMPLES", "data_type": "TYPE_INT32", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "Prosody_volume", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "OUTPUT", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "priority": "PRIORITY_DEFAULT", "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "sequence_batching": { "oldest": { "max_candidate_sequences": 8, "preferred_batch_size": [ 8 ], "max_queue_delay_microseconds": 100 }, "max_sequence_idle_microseconds": 60000000, "control_input": [ { "name": "START", "control": [ { "kind": "CONTROL_SEQUENCE_START", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "READY", "control": [ { "kind": "CONTROL_SEQUENCE_READY", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "END", "control": [ { "kind": "CONTROL_SEQUENCE_END", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "CORRID", "control": [ { "kind": "CONTROL_SEQUENCE_CORRID", "int32_false_true": [], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_UINT64" } ] } ], "state": [] }, "instance_group": [ { "name": "tts_postprocessor-English-US_0", "kind": "KIND_GPU", "count": 1, "gpus": [ 0 ], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "chunk_num_samples": { "string_value": "40960" }, "fade_length": { "string_value": "256" }, "use_denoiser": { "string_value": "False" }, "max_execution_batch_size": { "string_value": "8" }, "num_samples_per_frame": { "string_value": "512" }, "filter_length": { "string_value": "1024" }, "supports_volume": { "string_value": "True" }, "hop_length": { "string_value": "256" }, "max_chunk_size": { "string_value": "131072" } }, "model_warmup": [], "model_transaction_policy": { "decoupled": false } }
I0301 17:21:33.589998 111 tts-postprocessor.cc:302] TRITONBACKEND_ModelInstanceInitialize: tts_postprocessor-English-US_0 (device 0)
I0301 17:21:33.612302 111 tts-preprocessor.cc:280] TRITONBACKEND_ModelInitialize: tts_preprocessor-English-US (version 1)
I0301 17:21:33.612579 111 model_lifecycle.cc:693] successfully loaded 'tts_postprocessor-English-US' version 1
W0301 17:21:33.613051 111 tts-preprocessor.cc:241] Parameter abbreviation_path is deprecated
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0301 17:21:33.613122 141 preprocessor.cc:206] TTS character mapping loaded from /data/models/tts_preprocessor-English-US/1/mapping.txt
I0301 17:21:33.699532 141 preprocessor.cc:243] TTS phonetic mapping loaded from /data/models/tts_preprocessor-English-US/1/ipa_cmudict-0.7b_nv22.08.txt
I0301 17:21:33.699605 141 preprocessor.cc:256] Abbreviation mapping loaded from /data/models/tts_preprocessor-English-US/1/abbr.txt
W0301 17:21:33.699627 141 normalize.cc:52] Speech Class far file missing:/data/models/tts_preprocessor-English-US/1/speech_class.far
I0301 17:21:33.779708 141 preprocessor.cc:266] TTS normalizer loaded from /data/models/tts_preprocessor-English-US/1/
I0301 17:21:33.779825 111 backend_model.cc:303] model configuration: { "name": "tts_preprocessor-English-US", "platform": "", "backend": "riva_tts_preprocessor", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 8, "input": [ { "name": "input_string", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "speaker", "data_type": "TYPE_INT64", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "output", "data_type": "TYPE_INT64", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "output_mask", "data_type": "TYPE_FP32", "dims": [ 1, 400, 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "output_length", "data_type": "TYPE_INT32", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "is_last_sentence", "data_type": "TYPE_INT32", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "output_string", "data_type": "TYPE_STRING", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "sentence_num", "data_type": "TYPE_INT32", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "pitch", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "duration", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "volume", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "speaker", "data_type": "TYPE_INT64", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "graph": { "level": 0 }, "priority": "PRIORITY_DEFAULT", "cuda": { "graphs": false, "busy_wait_events": false, "graph_spec": [], "output_copy_stream": true }, "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "sequence_batching": { "oldest": { "max_candidate_sequences": 8, "preferred_batch_size": [ 8 ], "max_queue_delay_microseconds": 100 }, "max_sequence_idle_microseconds": 60000000, "control_input": [ { "name": "START", "control": [ { "kind": "CONTROL_SEQUENCE_START", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "READY", "control": [ { "kind": "CONTROL_SEQUENCE_READY", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "END", "control": [ { "kind": "CONTROL_SEQUENCE_END", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "CORRID", "control": [ { "kind": "CONTROL_SEQUENCE_CORRID", "int32_false_true": [], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_UINT64" } ] } ], "state": [] }, "instance_group": [ { "name": "tts_preprocessor-English-US_0", "kind": "KIND_GPU", "count": 1, "gpus": [ 0 ], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "phone_set": { "string_value": "ipa" }, "max_sequence_length": { "string_value": "400" }, "max_input_length": { "string_value": "2000" }, "start_of_emphasis_token": { "string_value": "[" }, "language": { "string_value": "en-US" }, "g2p_ignore_ambiguous": { "string_value": "True" }, "upper_case_chars": { "string_value": "True" }, "abbreviations_path": { "string_value": "/data/models/tts_preprocessor-English-US/1/abbr.txt" }, "pad_with_space": { "string_value": "True" }, "mapping_path": { "string_value": "/data/models/tts_preprocessor-English-US/1/mapping.txt" }, "dictionary_path": { "string_value": "/data/models/tts_preprocessor-English-US/1/ipa_cmudict-0.7b_nv22.08.txt" }, "end_of_emphasis_token": { "string_value": "]" }, "supports_ragged_batches": { "string_value": "True" }, "norm_proto_path": { "string_value": "/data/models/tts_preprocessor-English-US/1/" }, "enable_emphasis_tag": { "string_value": "True" } }, "model_warmup": [], "model_transaction_policy": { "decoupled": true } }
I0301 17:21:33.779912 111 tts-preprocessor.cc:282] TRITONBACKEND_ModelInstanceInitialize: tts_preprocessor-English-US_0 (device 0)
I0301 17:21:33.780493 111 model_lifecycle.cc:693] successfully loaded 'tts_preprocessor-English-US' version 1
I0301 17:21:33.780970 111 model_lifecycle.cc:459] loading: fastpitch_hifigan_ensemble-English-US:1
I0301 17:21:33.781168 111 model_lifecycle.cc:693] successfully loaded 'fastpitch_hifigan_ensemble-English-US' version 1
I0301 17:21:33.781274 111 server.cc:563]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+
I0301 17:21:33.781348 111 server.cc:590]
+------------------------+---------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend                | Path | Config |
+------------------------+---------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| riva_tts_postprocessor | /opt/tritonserver/backends/riva_tts_postprocessor/libtriton_riva_tts_postprocessor.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| onnxruntime            | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| python                 | /opt/tritonserver/backends/python/libtriton_python.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| riva_tts_preprocessor  | /opt/tritonserver/backends/riva_tts_preprocessor/libtriton_riva_tts_preprocessor.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| tensorrt               | /opt/tritonserver/backends/tensorrt/libtriton_tensorrt.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| riva_tts_chunker       | /opt/tritonserver/backends/riva_tts_chunker/libtriton_riva_tts_chunker.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
+------------------------+---------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
I0301 17:21:33.781467 111 server.cc:633]
+----------------------------------------+---------+--------+
| Model                                  | Version | Status |
+----------------------------------------+---------+--------+
| en_de_24x6                             | 1       | READY  |
| en_de_24x6-classifier                  | 1       | READY  |
| en_de_24x6-decoder                     | 1       | READY  |
| en_de_24x6-encoder                     | 1       | READY  |
| en_es_24x6                             | 1       | READY  |
| en_es_24x6-classifier                  | 1       | READY  |
| en_es_24x6-decoder                     | 1       | READY  |
| en_es_24x6-encoder                     | 1       | READY  |
| en_fr_24x6                             | 1       | READY  |
| en_fr_24x6-classifier                  | 1       | READY  |
| en_fr_24x6-decoder                     | 1       | READY  |
| en_fr_24x6-encoder                     | 1       | READY  |
| en_ru_24x6                             | 1       | READY  |
| en_ru_24x6-classifier                  | 1       | READY  |
| en_ru_24x6-decoder                     | 1       | READY  |
| en_ru_24x6-encoder                     | 1       | READY  |
| en_zh_24x6                             | 1       | READY  |
| en_zh_24x6-classifier                  | 1       | READY  |
| en_zh_24x6-decoder                     | 1       | READY  |
| en_zh_24x6-encoder                     | 1       | READY  |
| fastpitch_hifigan_ensemble-English-US  | 1       | READY  |
| riva-onnx-fastpitch_encoder-English-US | 1       | READY  |
| riva-trt-hifigan-English-US            | 1       | READY  |
| spectrogram_chunker-English-US         | 1       | READY  |
| tts_postprocessor-English-US           | 1       | READY  |
| tts_preprocessor-English-US            | 1       | READY  |
+----------------------------------------+---------+--------+
I0301 17:21:33.848081 111 metrics.cc:864] Collecting metrics for GPU 0: NVIDIA A100-SXM4-40GB
I0301 17:21:33.848560 111 metrics.cc:757] Collecting CPU metrics
I0301 17:21:33.848727 111 tritonserver.cc:2264]
+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value |
+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton |
| server_version                   | 2.27.0 |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace logging |
| model_repository_path[0]         | /data/models |
| model_control_mode               | MODE_NONE |
| strict_model_config              | 1 |
| rate_limit                       | OFF |
| pinned_memory_pool_byte_size     | 268435456 |
| cuda_memory_pool_byte_size{0}    | 1000000000 |
| response_cache_byte_size         | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness                 | 1 |
| exit_timeout                     | 30 |
+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
I0301 17:21:33.849818 111 grpc_server.cc:4819] Started GRPCInferenceService at 0.0.0.0:8001
I0301 17:21:33.850003 111 http_server.cc:3474] Started HTTPService at 0.0.0.0:8000
I0301 17:21:33.891369 111 http_server.cc:181] Started Metrics Service at 0.0.0.0:8002
> Triton server is ready...
I0301 17:21:34.450122 1760 riva_server.cc:125] Using Insecure Server Credentials
I0301 17:21:34.462869 1760 model_registry.cc:120] Successfully registered: fastpitch_hifigan_ensemble-English-US for TTS
I0301 17:21:34.466687 1760 model_registry.cc:120] Successfully registered: en_de_24x6 for NMT
I0301 17:21:34.468581 1760 model_registry.cc:120] Successfully registered: en_es_24x6 for NMT
I0301 17:21:34.470445 1760 model_registry.cc:120] Successfully registered: en_fr_24x6 for NMT
I0301 17:21:34.472291 1760 model_registry.cc:120] Successfully registered: en_ru_24x6 for NMT
I0301 17:21:34.474117 1760 model_registry.cc:120] Successfully registered: en_zh_24x6 for NMT
I0301 17:21:34.479175 1760 riva_server.cc:171] Riva Conversational AI Server listening on 0.0.0.0:50051
W0301 17:21:34.479184 1760 stats_reporter.cc:41] No API key provided. Stats reporting disabled.
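At this point Triton is serving HTTP on 8000, gRPC on 8001, and metrics on 8002, and the Riva API is listening on 50051. A readiness check against the Triton HTTP endpoint can confirm that the server and individual models (for example en_ru_24x6) are loaded before any requests are sent. The sketch below is illustrative only: it assumes the standard KServe v2 routes (/v2/health/ready and /v2/models/<name>/ready) and that port 8000 is published to the host where it runs; it is not part of this log.

    # Readiness probe against the Triton HTTP endpoint started above.
    # Assumes the standard KServe v2 routes and that port 8000 is reachable on localhost.
    import urllib.error
    import urllib.request

    BASE = "http://localhost:8000"

    def is_ready(path: str) -> bool:
        try:
            with urllib.request.urlopen(BASE + path, timeout=5) as resp:
                return resp.status == 200  # Triton answers 200 only when ready
        except urllib.error.URLError:
            return False

    print("server ready:", is_ready("/v2/health/ready"))
    for model in ("en_ru_24x6", "fastpitch_hifigan_ensemble-English-US"):
        print(model, "ready:", is_ready(f"/v2/models/{model}/ready"))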
I0301 17:32:02.745432 1765 grpc_riva_tts.cc:301] TTSService.Synthesize called.
I0301 17:33:04.661661 1765 grpc_riva_tts.cc:301] TTSService.Synthesize called.
I0301 17:36:48.372066 1782 grpc_riva_nmt.cc:85] NMT->TranslateText Requested: rmir_nmt_en_ru_24x6:2.9.0 with en -> ru 1ce1180a
E0301 17:36:48.372087 1782 grpc_riva_nmt.cc:106] Model not found rmir_nmt_en_ru_24x6:2.9.0
I0301 17:41:04.594254 1782 grpc_riva_nmt.cc:85] NMT->TranslateText Requested: rmir_nmt_en_ru_24x6 with en -> ru 6196b10d
E0301 17:41:04.594276 1782 grpc_riva_nmt.cc:106] Model not found rmir_nmt_en_ru_24x6
I0301 17:41:47.629882 1765 grpc_riva_nmt.cc:85] NMT->TranslateText Requested: rmir_nmt_en_ru_24x6 with en -> ru 6380bba4
E0301 17:41:47.629904 1765 grpc_riva_nmt.cc:106] Model not found rmir_nmt_en_ru_24x6
I0301 17:45:48.573861 1765 grpc_riva_nmt.cc:85] NMT->TranslateText Requested: en_ru_24x6 with en -> ru 557006c4
error: NMT model inference failure: Failed to process the request(s) for model instance 'en_ru_24x6', message: AttributeError: 'NoneType' object has no attribute 'is_cpu'
At:
  /data/models/en_ru_24x6/1/model.py(178): run
  /data/models/en_ru_24x6/1/model.py(326): run
  /data/models/en_ru_24x6/1/model.py(420): execute
I0301 17:45:48.748729 1765 grpc_riva_nmt.cc:122] NMT->TranslateText Completed: en_ru_24x6 with en -> ru 557006c4
I0301 17:49:29.815793 1765 grpc_riva_nmt.cc:85] NMT->TranslateText Requested: en_ru_24x6 with en -> ru 1afd2626
error: NMT model inference failure: Failed to process the request(s) for model instance 'en_ru_24x6', message: AttributeError: 'NoneType' object has no attribute 'is_cpu'
At:
  /data/models/en_ru_24x6/1/model.py(178): run
  /data/models/en_ru_24x6/1/model.py(326): run
  /data/models/en_ru_24x6/1/model.py(420): execute
I0301 17:49:29.820472 1765 grpc_riva_nmt.cc:122] NMT->TranslateText Completed: en_ru_24x6 with en -> ru 1afd2626
I0301 17:56:11.771831 1782 grpc_riva_nmt.cc:85] NMT->TranslateText Requested: en_es_24x6 with en -> es 5c8ab8fc
error: NMT model inference failure: Failed to process the request(s) for model instance 'en_es_24x6', message: AttributeError: 'NoneType' object has no attribute 'is_cpu'
At:
  /data/models/en_es_24x6/1/model.py(178): run
  /data/models/en_es_24x6/1/model.py(326): run
  /data/models/en_es_24x6/1/model.py(420): execute
I0301 17:56:11.938158 1782 grpc_riva_nmt.cc:122] NMT->TranslateText Completed: en_es_24x6 with en -> es 5c8ab8fc
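The three "Model not found" errors above were triggered by requests for rmir_nmt_en_ru_24x6 and rmir_nmt_en_ru_24x6:2.9.0, names that never appear in the registry; the startup log registers the NMT models as en_de_24x6, en_es_24x6, en_fr_24x6, en_ru_24x6, and en_zh_24x6. A client-side TranslateText call against the registered name would look roughly like the sketch below. It assumes the nvidia-riva-client Python package (riva.client) and its NeuralMachineTranslationClient wrapper, and the exact translate() signature may differ between client releases, so treat it as an illustration rather than a verified reproduction of the failing calls. It also does not address the separate AttributeError ('NoneType' object has no attribute 'is_cpu') raised inside /data/models/en_ru_24x6/1/model.py once the registered name is used.

    # Hypothetical client call, assuming the nvidia-riva-client package (riva.client);
    # the translate() signature here is an assumption and may vary by client version.
    import riva.client

    auth = riva.client.Auth(uri="localhost:50051")  # Riva gRPC endpoint from the log
    nmt = riva.client.NeuralMachineTranslationClient(auth)

    # Use the name the server registered ("en_ru_24x6"), not "rmir_nmt_en_ru_24x6".
    response = nmt.translate(
        texts=["Hello, world!"],
        model="en_ru_24x6",
        source_language="en",
        target_language="ru",
    )
    for translation in response.translations:
        print(translation.text)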