==========================
=== Riva Speech Skills ===
==========================
NVIDIA Release (build 46434648)
Copyright (c) 2018-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
> Riva waiting for Triton server to load all models...retrying in 1 second
> Riva waiting for Triton server to load all models...retrying in 1 second
I1120 13:43:34.848770 86 libtorch.cc:1381] TRITONBACKEND_Initialize: pytorch
I1120 13:43:34.849350 86 libtorch.cc:1391] Triton TRITONBACKEND API version: 1.9
I1120 13:43:34.849564 86 libtorch.cc:1397] 'pytorch' TRITONBACKEND API version: 1.9
I1120 13:43:34.894096 86 onnxruntime.cc:2400] TRITONBACKEND_Initialize: onnxruntime
I1120 13:43:34.894598 86 onnxruntime.cc:2410] Triton TRITONBACKEND API version: 1.9
I1120 13:43:34.894953 86 onnxruntime.cc:2416] 'onnxruntime' TRITONBACKEND API version: 1.9
I1120 13:43:34.895262 86 onnxruntime.cc:2446] backend configuration: {}
> Riva waiting for Triton server to load all models...retrying in 1 second
> Riva waiting for Triton server to load all models...retrying in 1 second
I1120 13:43:37.723201 86 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f69ee000000' with size 268435456
I1120 13:43:37.723673 86 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 1000000000
> Riva waiting for Triton server to load all models...retrying in 1 second
I1120 13:43:37.758927 86 model_repository_manager.cc:1077] loading: conformer-en-US-asr-streaming-endpointing-streaming:1
I1120 13:43:37.859566 86 model_repository_manager.cc:1077] loading: conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming:1
I1120 13:43:37.862660 86 endpointing_library.cc:18] TRITONBACKEND_ModelInitialize: conformer-en-US-asr-streaming-endpointing-streaming (version 1)
WARNING: Logging before InitGoogleLogging() is written to STDERR
W1120 13:43:37.866254 105 parameter_parser.cc:144] Parameter 'chunk_size' set but unused.
W1120 13:43:37.866314 105 parameter_parser.cc:144] Parameter 'ms_per_timestep' set but unused.
W1120 13:43:37.866349 105 parameter_parser.cc:144] Parameter 'residue_blanks_at_end' set but unused.
W1120 13:43:37.866358 105 parameter_parser.cc:144] Parameter 'residue_blanks_at_start' set but unused.
W1120 13:43:37.866365 105 parameter_parser.cc:144] Parameter 'start_history' set but unused.
W1120 13:43:37.866372 105 parameter_parser.cc:144] Parameter 'start_th' set but unused.
W1120 13:43:37.866379 105 parameter_parser.cc:144] Parameter 'stop_history' set but unused.
W1120 13:43:37.866386 105 parameter_parser.cc:144] Parameter 'stop_th' set but unused.
W1120 13:43:37.866394 105 parameter_parser.cc:144] Parameter 'streaming' set but unused.
W1120 13:43:37.866400 105 parameter_parser.cc:144] Parameter 'use_subword' set but unused.
W1120 13:43:37.866406 105 parameter_parser.cc:144] Parameter 'vocab_file' set but unused.
I1120 13:43:37.867470 86 backend_model.cc:303] model configuration: { "name": "conformer-en-US-asr-streaming-endpointing-streaming", "platform": "", "backend": "riva_asr_endpointing", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 2048, "input": [ { "name": "CLASS_LOGITS", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1, 257 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "SEGMENTS_START_END", "data_type": "TYPE_FP32", "dims": [ -1, 2 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "graph": { "level": 0 }, "priority": "PRIORITY_DEFAULT", "cuda": { "graphs": false, "busy_wait_events": false, "graph_spec": [], "output_copy_stream": true }, "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "sequence_batching": { "max_sequence_idle_microseconds": 60000000, "control_input": [ { "name": "START", "control": [ { "kind": "CONTROL_SEQUENCE_START", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "READY", "control": [ { "kind": "CONTROL_SEQUENCE_READY", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] } ], "state": [] }, "instance_group": [ { "name": "conformer-en-US-asr-streaming-endpointing-streaming_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "start_th": { "string_value": "0.2" }, "chunk_size": { "string_value": "0.16" }, "endpointing_type": { "string_value": "greedy_ctc" }, "stop_th": { "string_value": "0.98" }, "vocab_file": { "string_value": 
"/data/models/conformer-en-US-asr-streaming-endpointing-streaming/1/riva_decoder_vocabulary.txt" }, "start_history": { "string_value": "200" }, "ms_per_timestep": { "string_value": "40" }, "residue_blanks_at_start": { "string_value": "-2" }, "streaming": { "string_value": "True" }, "use_subword": { "string_value": "True" }, "stop_history": { "string_value": "800" }, "residue_blanks_at_end": { "string_value": "0" } }, "model_warmup": [], "model_transaction_policy": { "decoupled": false } } I1120 13:43:37.867926 86 endpointing_library.cc:22] TRITONBACKEND_ModelInstanceInitialize: conformer-en-US-asr-streaming-endpointing-streaming_0 (device 0) I1120 13:43:37.960809 86 model_repository_manager.cc:1077] loading: conformer-en-US-asr-offline-feature-extractor-streaming-offline:1 I1120 13:43:37.991953 86 model_repository_manager.cc:1231] successfully loaded 'conformer-en-US-asr-streaming-endpointing-streaming' version 1 I1120 13:43:38.016760 86 ctc-decoder-library.cc:20] TRITONBACKEND_ModelInitialize: conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming (version 1) WARNING: Logging before InitGoogleLogging() is written to STDERR W1120 13:43:38.019518 106 parameter_parser.cc:144] Parameter 'beam_size' set but unused. W1120 13:43:38.019572 106 parameter_parser.cc:144] Parameter 'beam_size_token' set but unused. W1120 13:43:38.019579 106 parameter_parser.cc:144] Parameter 'beam_threshold' set but unused. W1120 13:43:38.019587 106 parameter_parser.cc:144] Parameter 'blank_token' set but unused. W1120 13:43:38.019593 106 parameter_parser.cc:144] Parameter 'decoder_num_worker_threads' set but unused. W1120 13:43:38.019601 106 parameter_parser.cc:144] Parameter 'forerunner_beam_size' set but unused. W1120 13:43:38.019608 106 parameter_parser.cc:144] Parameter 'forerunner_beam_size_token' set but unused. W1120 13:43:38.019615 106 parameter_parser.cc:144] Parameter 'forerunner_beam_threshold' set but unused. 
W1120 13:43:38.019621 106 parameter_parser.cc:144] Parameter 'forerunner_use_lm' set but unused.
W1120 13:43:38.019629 106 parameter_parser.cc:144] Parameter 'language_model_file' set but unused.
W1120 13:43:38.019635 106 parameter_parser.cc:144] Parameter 'lexicon_file' set but unused.
W1120 13:43:38.019642 106 parameter_parser.cc:144] Parameter 'lm_weight' set but unused.
W1120 13:43:38.019649 106 parameter_parser.cc:144] Parameter 'log_add' set but unused.
W1120 13:43:38.019656 106 parameter_parser.cc:144] Parameter 'max_execution_batch_size' set but unused.
W1120 13:43:38.019663 106 parameter_parser.cc:144] Parameter 'max_supported_transcripts' set but unused.
W1120 13:43:38.019670 106 parameter_parser.cc:144] Parameter 'num_tokenization' set but unused.
W1120 13:43:38.019677 106 parameter_parser.cc:144] Parameter 'profane_words_file' set but unused.
W1120 13:43:38.019685 106 parameter_parser.cc:144] Parameter 'set_default_index_to_unk_token' set but unused.
W1120 13:43:38.019690 106 parameter_parser.cc:144] Parameter 'sil_token' set but unused.
W1120 13:43:38.019697 106 parameter_parser.cc:144] Parameter 'smearing_mode' set but unused.
W1120 13:43:38.019704 106 parameter_parser.cc:144] Parameter 'tokenizer_model' set but unused.
W1120 13:43:38.019711 106 parameter_parser.cc:144] Parameter 'unk_score' set but unused.
W1120 13:43:38.019718 106 parameter_parser.cc:144] Parameter 'unk_token' set but unused.
W1120 13:43:38.019726 106 parameter_parser.cc:144] Parameter 'vocab_file' set but unused.
W1120 13:43:38.019732 106 parameter_parser.cc:144] Parameter 'word_insertion_score' set but unused.
I1120 13:43:38.021615 86 backend_model.cc:303] model configuration: { "name": "conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming", "platform": "", "backend": "riva_asr_decoder", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 1024, "input": [ { "name": "CLASS_LOGITS", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1, 257 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "END_FLAG", "data_type": "TYPE_UINT32", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "SEGMENTS_START_END", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1, 2 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "CUSTOM_CONFIGURATION", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ -1, 2 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "FINAL_TRANSCRIPTS", "data_type": "TYPE_STRING", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "FINAL_TRANSCRIPTS_SCORE", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "FINAL_WORDS_START_END", "data_type": "TYPE_INT32", "dims": [ -1, 2 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PARTIAL_TRANSCRIPTS", "data_type": "TYPE_STRING", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PARTIAL_TRANSCRIPTS_STABILITY", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PARTIAL_WORDS_START_END", "data_type": "TYPE_INT32", "dims": [ -1, 2 ], "label_filename": "", "is_shape_tensor": false }, { "name": "FINAL_WORDS_SCORE", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PARTIAL_WORDS_SCORE", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", 
"is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "graph": { "level": 0 }, "priority": "PRIORITY_DEFAULT", "cuda": { "graphs": false, "busy_wait_events": false, "graph_spec": [], "output_copy_stream": true }, "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "sequence_batching": { "oldest": { "max_candidate_sequences": 1024, "preferred_batch_size": [ 32, 64 ], "max_queue_delay_microseconds": 1000 }, "max_sequence_idle_microseconds": 60000000, "control_input": [ { "name": "START", "control": [ { "kind": "CONTROL_SEQUENCE_START", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "READY", "control": [ { "kind": "CONTROL_SEQUENCE_READY", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "END", "control": [ { "kind": "CONTROL_SEQUENCE_END", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "CORRID", "control": [ { "kind": "CONTROL_SEQUENCE_CORRID", "int32_false_true": [], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_UINT64" } ] } ], "state": [] }, "instance_group": [ { "name": "conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "forerunner_beam_size": { "string_value": "8" }, "unk_score": { "string_value": "-inf" }, "max_supported_transcripts": { "string_value": "1" }, "chunk_size": { "string_value": "0.16" }, "lexicon_file": { "string_value": "/data/models/conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/lexicon.txt" }, "smearing_mode": { "string_value": "max" }, "log_add": { 
"string_value": "True" }, "blank_token": { "string_value": "#" }, "lm_weight": { "string_value": "0.8" }, "vocab_file": { "string_value": "/data/models/conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/riva_decoder_vocabulary.txt" }, "ms_per_timestep": { "string_value": "40" }, "streaming": { "string_value": "True" }, "use_subword": { "string_value": "True" }, "beam_size": { "string_value": "32" }, "right_padding_size": { "string_value": "1.92" }, "beam_size_token": { "string_value": "16" }, "sil_token": { "string_value": "▁" }, "num_tokenization": { "string_value": "1" }, "beam_threshold": { "string_value": "20.0" }, "tokenizer_model": { "string_value": "/data/models/conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/8b8f095152034e98b24ab33726708bd0_tokenizer.model" }, "language_model_file": { "string_value": "/data/models/conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/4gram-pruned-0_2_7_9-en-lm-set-2.0.bin" }, "max_execution_batch_size": { "string_value": "1024" }, "forerunner_use_lm": { "string_value": "true" }, "profane_words_file": { "string_value": "/data/models/conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/profane_words_file.txt" }, "forerunner_beam_size_token": { "string_value": "8" }, "forerunner_beam_threshold": { "string_value": "10.0" }, "asr_model_delay": { "string_value": "-1" }, "decoder_num_worker_threads": { "string_value": "-1" }, "word_insertion_score": { "string_value": "1.0" }, "unk_token": { "string_value": "" }, "left_padding_size": { "string_value": "1.92" }, "set_default_index_to_unk_token": { "string_value": "False" }, "decoder_type": { "string_value": "flashlight" } }, "model_warmup": [], "model_transaction_policy": { "decoupled": false } } I1120 13:43:38.022155 86 ctc-decoder-library.cc:23] TRITONBACKEND_ModelInstanceInitialize: conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming_0 (device 0) I1120 13:43:38.061599 86 model_repository_manager.cc:1077] loading: 
conformer-en-US-asr-offline-endpointing-streaming-offline:1
I1120 13:43:38.162235 86 model_repository_manager.cc:1077] loading: conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline:1
I1120 13:43:38.262947 86 model_repository_manager.cc:1077] loading: conformer-en-US-asr-streaming-feature-extractor-streaming:1
I1120 13:43:38.363536 86 model_repository_manager.cc:1077] loading: intent_slot_detokenizer:1
I1120 13:43:38.464156 86 model_repository_manager.cc:1077] loading: intent_slot_label_tokens_weather:1
I1120 13:43:38.564931 86 model_repository_manager.cc:1077] loading: intent_slot_tokenizer-en-US-weather:1
I1120 13:43:38.665562 86 model_repository_manager.cc:1077] loading: qa_qa_postprocessor:1
> Riva waiting for Triton server to load all models...retrying in 1 second
I1120 13:43:38.767759 86 model_repository_manager.cc:1077] loading: qa_tokenizer-en-US:1
I1120 13:43:38.868272 86 model_repository_manager.cc:1077] loading: riva-onnx-fastpitch_encoder-English-US:1
I1120 13:43:38.968790 86 model_repository_manager.cc:1077] loading: riva-punctuation-en-US:1
I1120 13:43:39.069319 86 model_repository_manager.cc:1077] loading: riva-trt-conformer-en-US-asr-offline-am-streaming-offline:1
I1120 13:43:39.170026 86 model_repository_manager.cc:1077] loading: riva-trt-conformer-en-US-asr-streaming-am-streaming:1
I1120 13:43:39.270593 86 model_repository_manager.cc:1077] loading: riva-trt-hifigan-English-US:1
I1120 13:43:39.371143 86 model_repository_manager.cc:1077] loading: riva-trt-riva-punctuation-en-US-nn-bert-base-uncased:1
I1120 13:43:39.471707 86 model_repository_manager.cc:1077] loading: riva-trt-riva_intent_weather-nn-bert-base-uncased:1
I1120 13:43:39.572422 86 model_repository_manager.cc:1077] loading: riva-trt-riva_ner-nn-bert-base-uncased:1
I1120 13:43:39.673015 86 model_repository_manager.cc:1077] loading: riva-trt-riva_qa-nn-bert-base-uncased:1
I1120 13:43:39.773445 86 model_repository_manager.cc:1077] loading: riva-trt-riva_text_classification_domain-nn-bert-base-uncased:1
> Riva waiting for Triton server to load all models...retrying in 1 second
I1120 13:43:39.873996 86 model_repository_manager.cc:1077] loading: spectrogram_chunker-English-US:1
I1120 13:43:39.974535 86 model_repository_manager.cc:1077] loading: text_classification_tokenizer-en-US-domain:1
I1120 13:43:40.075222 86 model_repository_manager.cc:1077] loading: token_classification_detokenizer:1
I1120 13:43:40.175758 86 model_repository_manager.cc:1077] loading: token_classification_label_tokens:1
I1120 13:43:40.276435 86 model_repository_manager.cc:1077] loading: token_classification_tokenizer-en-US:1
I1120 13:43:40.377041 86 model_repository_manager.cc:1077] loading: tts_postprocessor-English-US:1
I1120 13:43:40.477834 86 model_repository_manager.cc:1077] loading: tts_preprocessor-English-US:1
> Riva waiting for Triton server to load all models...retrying in 1 second
I1120 13:43:40.890015 106 ctc-decoder.cc:174] Beam Decoder initialized successfully!
I1120 13:43:40.891242 86 model_repository_manager.cc:1231] successfully loaded 'conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming' version 1 I1120 13:43:40.892946 86 feature-extractor.cc:400] TRITONBACKEND_ModelInitialize: conformer-en-US-asr-offline-feature-extractor-streaming-offline (version 1) I1120 13:43:40.925257 86 backend_model.cc:303] model configuration: { "name": "conformer-en-US-asr-offline-feature-extractor-streaming-offline", "platform": "", "backend": "riva_asr_features", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 512, "input": [ { "name": "AUDIO_SIGNAL", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "SAMPLE_RATE", "data_type": "TYPE_UINT32", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "AUDIO_FEATURES", "data_type": "TYPE_FP32", "dims": [ 80, -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "AUDIO_PROCESSED", "data_type": "TYPE_FP32", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "AUDIO_FEATURES_LENGTH", "data_type": "TYPE_INT32", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "graph": { "level": 0 }, "priority": "PRIORITY_DEFAULT", "cuda": { "graphs": false, "busy_wait_events": false, "graph_spec": [], "output_copy_stream": true }, "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "sequence_batching": { "oldest": { "max_candidate_sequences": 512, "preferred_batch_size": [ 256, 512 ], "max_queue_delay_microseconds": 1000 }, "max_sequence_idle_microseconds": 60000000, "control_input": [ { "name": "START", "control": [ { "kind": "CONTROL_SEQUENCE_START", "int32_false_true": [ 0, 1 ], 
"fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "READY", "control": [ { "kind": "CONTROL_SEQUENCE_READY", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "END", "control": [ { "kind": "CONTROL_SEQUENCE_END", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "CORRID", "control": [ { "kind": "CONTROL_SEQUENCE_CORRID", "int32_false_true": [], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_UINT64" } ] } ], "state": [] }, "instance_group": [ { "name": "conformer-en-US-asr-offline-feature-extractor-streaming-offline_0", "kind": "KIND_GPU", "count": 1, "gpus": [ 0 ], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "streaming": { "string_value": "True" }, "transpose": { "string_value": "False" }, "left_padding_size": { "string_value": "1.6" }, "stddev_floor": { "string_value": "1e-05" }, "right_padding_size": { "string_value": "1.6" }, "gain": { "string_value": "1.0" }, "use_utterance_norm_params": { "string_value": "False" }, "precalc_norm_time_steps": { "string_value": "0" }, "dither": { "string_value": "0.0" }, "apply_normalization": { "string_value": "True" }, "precalc_norm_params": { "string_value": "False" }, "norm_per_feature": { "string_value": "True" }, "mean": { "string_value": "-11.4412, -9.9334, -9.1292, -9.0365, -9.2804, -9.5643, -9.7342, -9.6925, -9.6333, -9.2808, -9.1887, -9.1422, -9.1397, -9.2028, -9.2749, -9.4776, -9.9185, -10.1557, -10.3800, -10.5067, -10.3190, -10.4728, -10.5529, -10.6402, -10.6440, -10.5113, -10.7395, -10.7870, -10.6074, -10.5033, -10.8278, -10.6384, -10.8481, -10.6875, -10.5454, -10.4747, -10.5165, -10.4930, -10.3413, -10.3472, -10.3735, -10.6830, -10.8813, -10.6338, -10.3856, -10.7727, -10.8957, -10.8068, 
-10.7373, -10.6108, -10.3405, -10.2889, -10.3922, -10.4946, -10.3367, -10.4164, -10.9949, -10.7196, -10.3971, -10.1734, -9.9257, -9.6557, -9.1761, -9.6653, -9.7876, -9.7230, -9.7792, -9.7056, -9.2702, -9.4650, -9.2755, -9.1369, -9.1174, -8.9197, -8.5394, -8.2614, -8.1353, -8.1422, -8.3430, -8.6655" }, "stddev": { "string_value": "2.2668, 3.1642, 3.7079, 3.7642, 3.5349, 3.5901, 3.7640, 3.8424, 4.0145, 4.1475, 4.0457, 3.9048, 3.7709, 3.6117, 3.3188, 3.1489, 3.0615, 3.0362, 2.9929, 3.0500, 3.0341, 3.0484, 3.0103, 2.9474, 2.9128, 2.8669, 2.8332, 2.9411, 3.0378, 3.0712, 3.0190, 2.9992, 3.0124, 3.0024, 3.0275, 3.0870, 3.0656, 3.0142, 3.0493, 3.1373, 3.1135, 3.0675, 2.8828, 2.7018, 2.6296, 2.8826, 2.9325, 2.9288, 2.9271, 2.9890, 3.0137, 2.9855, 3.0839, 2.9319, 2.3512, 2.3795, 2.6191, 2.7555, 2.9326, 2.9931, 3.1543, 3.0855, 2.6820, 3.0566, 3.1272, 3.1663, 3.1836, 3.0018, 2.9089, 3.1727, 3.1626, 3.1086, 2.9804, 3.1107, 3.2998, 3.3697, 3.3716, 3.2487, 3.1597, 3.1181" }, "chunk_size": { "string_value": "4.8" }, "max_execution_batch_size": { "string_value": "512" }, "sample_rate": { "string_value": "16000" }, "window_size": { "string_value": "0.025" }, "num_features": { "string_value": "80" }, "window_stride": { "string_value": "0.01" } }, "model_warmup": [], "model_transaction_policy": { "decoupled": false } } I1120 13:43:40.925554 86 endpointing_library.cc:18] TRITONBACKEND_ModelInitialize: conformer-en-US-asr-offline-endpointing-streaming-offline (version 1) W1120 13:43:40.927836 131 parameter_parser.cc:144] Parameter 'chunk_size' set but unused. W1120 13:43:40.927917 131 parameter_parser.cc:144] Parameter 'ms_per_timestep' set but unused. W1120 13:43:40.927933 131 parameter_parser.cc:144] Parameter 'residue_blanks_at_end' set but unused. W1120 13:43:40.927945 131 parameter_parser.cc:144] Parameter 'residue_blanks_at_start' set but unused. W1120 13:43:40.927956 131 parameter_parser.cc:144] Parameter 'start_history' set but unused. 
W1120 13:43:40.927966 131 parameter_parser.cc:144] Parameter 'start_th' set but unused. W1120 13:43:40.927976 131 parameter_parser.cc:144] Parameter 'stop_history' set but unused. W1120 13:43:40.927987 131 parameter_parser.cc:144] Parameter 'stop_th' set but unused. W1120 13:43:40.927999 131 parameter_parser.cc:144] Parameter 'streaming' set but unused. W1120 13:43:40.928010 131 parameter_parser.cc:144] Parameter 'use_subword' set but unused. W1120 13:43:40.928021 131 parameter_parser.cc:144] Parameter 'vocab_file' set but unused. I1120 13:43:40.929474 86 backend_model.cc:303] model configuration: { "name": "conformer-en-US-asr-offline-endpointing-streaming-offline", "platform": "", "backend": "riva_asr_endpointing", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 2048, "input": [ { "name": "CLASS_LOGITS", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1, 257 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "SEGMENTS_START_END", "data_type": "TYPE_FP32", "dims": [ -1, 2 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "graph": { "level": 0 }, "priority": "PRIORITY_DEFAULT", "cuda": { "graphs": false, "busy_wait_events": false, "graph_spec": [], "output_copy_stream": true }, "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "sequence_batching": { "max_sequence_idle_microseconds": 60000000, "control_input": [ { "name": "START", "control": [ { "kind": "CONTROL_SEQUENCE_START", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "READY", "control": [ { "kind": "CONTROL_SEQUENCE_READY", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] } ], "state": [] }, "instance_group": [ { "name": 
"conformer-en-US-asr-offline-endpointing-streaming-offline_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "residue_blanks_at_start": { "string_value": "-2" }, "ms_per_timestep": { "string_value": "40" }, "streaming": { "string_value": "True" }, "use_subword": { "string_value": "True" }, "stop_history": { "string_value": "800" }, "residue_blanks_at_end": { "string_value": "0" }, "start_th": { "string_value": "0.2" }, "chunk_size": { "string_value": "4.8" }, "endpointing_type": { "string_value": "greedy_ctc" }, "stop_th": { "string_value": "0.98" }, "vocab_file": { "string_value": "/data/models/conformer-en-US-asr-offline-endpointing-streaming-offline/1/riva_decoder_vocabulary.txt" }, "start_history": { "string_value": "200" } }, "model_warmup": [], "model_transaction_policy": { "decoupled": false } } I1120 13:43:40.929618 86 ctc-decoder-library.cc:20] TRITONBACKEND_ModelInitialize: conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline (version 1) W1120 13:43:40.931679 132 parameter_parser.cc:144] Parameter 'beam_size' set but unused. W1120 13:43:40.931702 132 parameter_parser.cc:144] Parameter 'beam_size_token' set but unused. W1120 13:43:40.931710 132 parameter_parser.cc:144] Parameter 'beam_threshold' set but unused. W1120 13:43:40.931716 132 parameter_parser.cc:144] Parameter 'blank_token' set but unused. W1120 13:43:40.931723 132 parameter_parser.cc:144] Parameter 'decoder_num_worker_threads' set but unused. W1120 13:43:40.931731 132 parameter_parser.cc:144] Parameter 'forerunner_beam_size' set but unused. W1120 13:43:40.931737 132 parameter_parser.cc:144] Parameter 'forerunner_beam_size_token' set but unused. W1120 13:43:40.931744 132 parameter_parser.cc:144] Parameter 'forerunner_beam_threshold' set but unused. 
W1120 13:43:40.931751 132 parameter_parser.cc:144] Parameter 'forerunner_use_lm' set but unused.
W1120 13:43:40.931757 132 parameter_parser.cc:144] Parameter 'language_model_file' set but unused.
W1120 13:43:40.931764 132 parameter_parser.cc:144] Parameter 'lexicon_file' set but unused.
W1120 13:43:40.931771 132 parameter_parser.cc:144] Parameter 'lm_weight' set but unused.
W1120 13:43:40.931778 132 parameter_parser.cc:144] Parameter 'log_add' set but unused.
W1120 13:43:40.931785 132 parameter_parser.cc:144] Parameter 'max_execution_batch_size' set but unused.
W1120 13:43:40.931792 132 parameter_parser.cc:144] Parameter 'max_supported_transcripts' set but unused.
W1120 13:43:40.931798 132 parameter_parser.cc:144] Parameter 'num_tokenization' set but unused.
W1120 13:43:40.931805 132 parameter_parser.cc:144] Parameter 'profane_words_file' set but unused.
W1120 13:43:40.931813 132 parameter_parser.cc:144] Parameter 'set_default_index_to_unk_token' set but unused.
W1120 13:43:40.931819 132 parameter_parser.cc:144] Parameter 'sil_token' set but unused.
W1120 13:43:40.931825 132 parameter_parser.cc:144] Parameter 'smearing_mode' set but unused.
W1120 13:43:40.931833 132 parameter_parser.cc:144] Parameter 'tokenizer_model' set but unused.
W1120 13:43:40.931839 132 parameter_parser.cc:144] Parameter 'unk_score' set but unused.
W1120 13:43:40.931846 132 parameter_parser.cc:144] Parameter 'unk_token' set but unused.
W1120 13:43:40.931854 132 parameter_parser.cc:144] Parameter 'vocab_file' set but unused.
W1120 13:43:40.931859 132 parameter_parser.cc:144] Parameter 'word_insertion_score' set but unused.
I1120 13:43:40.933444 86 backend_model.cc:303] model configuration: { "name": "conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline", "platform": "", "backend": "riva_asr_decoder", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 1024, "input": [ { "name": "CLASS_LOGITS", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1, 257 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "END_FLAG", "data_type": "TYPE_UINT32", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "SEGMENTS_START_END", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1, 2 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "CUSTOM_CONFIGURATION", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ -1, 2 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "FINAL_TRANSCRIPTS", "data_type": "TYPE_STRING", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "FINAL_TRANSCRIPTS_SCORE", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "FINAL_WORDS_START_END", "data_type": "TYPE_INT32", "dims": [ -1, 2 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PARTIAL_TRANSCRIPTS", "data_type": "TYPE_STRING", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PARTIAL_TRANSCRIPTS_STABILITY", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PARTIAL_WORDS_START_END", "data_type": "TYPE_INT32", "dims": [ -1, 2 ], "label_filename": "", "is_shape_tensor": false }, { "name": "FINAL_WORDS_SCORE", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PARTIAL_WORDS_SCORE", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", 
"is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "graph": { "level": 0 }, "priority": "PRIORITY_DEFAULT", "cuda": { "graphs": false, "busy_wait_events": false, "graph_spec": [], "output_copy_stream": true }, "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "sequence_batching": { "oldest": { "max_candidate_sequences": 1024, "preferred_batch_size": [ 32, 64 ], "max_queue_delay_microseconds": 1000 }, "max_sequence_idle_microseconds": 60000000, "control_input": [ { "name": "START", "control": [ { "kind": "CONTROL_SEQUENCE_START", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "READY", "control": [ { "kind": "CONTROL_SEQUENCE_READY", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "END", "control": [ { "kind": "CONTROL_SEQUENCE_END", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "CORRID", "control": [ { "kind": "CONTROL_SEQUENCE_CORRID", "int32_false_true": [], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_UINT64" } ] } ], "state": [] }, "instance_group": [ { "name": "conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "max_execution_batch_size": { "string_value": "1024" }, "forerunner_use_lm": { "string_value": "true" }, "profane_words_file": { "string_value": "/data/models/conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline/1/profane_words_file.txt" }, "forerunner_beam_size_token": { "string_value": "8" }, "forerunner_beam_threshold": { "string_value": "10.0" 
}, "asr_model_delay": { "string_value": "-1" }, "decoder_num_worker_threads": { "string_value": "-1" }, "word_insertion_score": { "string_value": "1.0" }, "unk_token": { "string_value": "" }, "left_padding_size": { "string_value": "1.6" }, "set_default_index_to_unk_token": { "string_value": "False" }, "decoder_type": { "string_value": "flashlight" }, "forerunner_beam_size": { "string_value": "8" }, "unk_score": { "string_value": "-inf" }, "max_supported_transcripts": { "string_value": "1" }, "chunk_size": { "string_value": "4.8" }, "lexicon_file": { "string_value": "/data/models/conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline/1/lexicon.txt" }, "smearing_mode": { "string_value": "max" }, "log_add": { "string_value": "True" }, "blank_token": { "string_value": "#" }, "lm_weight": { "string_value": "0.8" }, "vocab_file": { "string_value": "/data/models/conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline/1/riva_decoder_vocabulary.txt" }, "ms_per_timestep": { "string_value": "40" }, "use_subword": { "string_value": "True" }, "streaming": { "string_value": "True" }, "beam_size": { "string_value": "32" }, "right_padding_size": { "string_value": "1.6" }, "beam_size_token": { "string_value": "16" }, "sil_token": { "string_value": "▁" }, "num_tokenization": { "string_value": "1" }, "beam_threshold": { "string_value": "20.0" }, "language_model_file": { "string_value": "/data/models/conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline/1/4gram-pruned-0_2_7_9-en-lm-set-2.0.bin" }, "tokenizer_model": { "string_value": "/data/models/conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline/1/8b8f095152034e98b24ab33726708bd0_tokenizer.model" } }, "model_warmup": [], "model_transaction_policy": { "decoupled": false } } I1120 13:43:40.933581 86 feature-extractor.cc:400] TRITONBACKEND_ModelInitialize: conformer-en-US-asr-streaming-feature-extractor-streaming (version 1) I1120 13:43:40.935657 86 backend_model.cc:303] model configuration: { "name": 
"conformer-en-US-asr-streaming-feature-extractor-streaming", "platform": "", "backend": "riva_asr_features", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 1024, "input": [ { "name": "AUDIO_SIGNAL", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "SAMPLE_RATE", "data_type": "TYPE_UINT32", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "AUDIO_FEATURES", "data_type": "TYPE_FP32", "dims": [ 80, -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "AUDIO_PROCESSED", "data_type": "TYPE_FP32", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "AUDIO_FEATURES_LENGTH", "data_type": "TYPE_INT32", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "graph": { "level": 0 }, "priority": "PRIORITY_DEFAULT", "cuda": { "graphs": false, "busy_wait_events": false, "graph_spec": [], "output_copy_stream": true }, "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "sequence_batching": { "oldest": { "max_candidate_sequences": 1024, "preferred_batch_size": [ 256, 512 ], "max_queue_delay_microseconds": 1000 }, "max_sequence_idle_microseconds": 60000000, "control_input": [ { "name": "START", "control": [ { "kind": "CONTROL_SEQUENCE_START", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "READY", "control": [ { "kind": "CONTROL_SEQUENCE_READY", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "END", "control": [ { "kind": "CONTROL_SEQUENCE_END", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], 
"data_type": "TYPE_INVALID" } ] }, { "name": "CORRID", "control": [ { "kind": "CONTROL_SEQUENCE_CORRID", "int32_false_true": [], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_UINT64" } ] } ], "state": [] }, "instance_group": [ { "name": "conformer-en-US-asr-streaming-feature-extractor-streaming_0", "kind": "KIND_GPU", "count": 1, "gpus": [ 0 ], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "window_stride": { "string_value": "0.01" }, "num_features": { "string_value": "80" }, "window_size": { "string_value": "0.025" }, "streaming": { "string_value": "True" }, "left_padding_size": { "string_value": "1.92" }, "stddev_floor": { "string_value": "1e-05" }, "transpose": { "string_value": "False" }, "right_padding_size": { "string_value": "1.92" }, "gain": { "string_value": "1.0" }, "use_utterance_norm_params": { "string_value": "False" }, "precalc_norm_time_steps": { "string_value": "0" }, "precalc_norm_params": { "string_value": "False" }, "apply_normalization": { "string_value": "True" }, "dither": { "string_value": "0.0" }, "norm_per_feature": { "string_value": "True" }, "mean": { "string_value": "-11.4412, -9.9334, -9.1292, -9.0365, -9.2804, -9.5643, -9.7342, -9.6925, -9.6333, -9.2808, -9.1887, -9.1422, -9.1397, -9.2028, -9.2749, -9.4776, -9.9185, -10.1557, -10.3800, -10.5067, -10.3190, -10.4728, -10.5529, -10.6402, -10.6440, -10.5113, -10.7395, -10.7870, -10.6074, -10.5033, -10.8278, -10.6384, -10.8481, -10.6875, -10.5454, -10.4747, -10.5165, -10.4930, -10.3413, -10.3472, -10.3735, -10.6830, -10.8813, -10.6338, -10.3856, -10.7727, -10.8957, -10.8068, -10.7373, -10.6108, -10.3405, -10.2889, -10.3922, -10.4946, -10.3367, -10.4164, -10.9949, -10.7196, -10.3971, -10.1734, -9.9257, -9.6557, -9.1761, -9.6653, -9.7876, -9.7230, -9.7792, -9.7056, -9.2702, -9.4650, -9.2755, -9.1369, -9.1174, -8.9197, -8.5394, -8.2614, -8.1353, 
-8.1422, -8.3430, -8.6655" }, "stddev": { "string_value": "2.2668, 3.1642, 3.7079, 3.7642, 3.5349, 3.5901, 3.7640, 3.8424, 4.0145, 4.1475, 4.0457, 3.9048, 3.7709, 3.6117, 3.3188, 3.1489, 3.0615, 3.0362, 2.9929, 3.0500, 3.0341, 3.0484, 3.0103, 2.9474, 2.9128, 2.8669, 2.8332, 2.9411, 3.0378, 3.0712, 3.0190, 2.9992, 3.0124, 3.0024, 3.0275, 3.0870, 3.0656, 3.0142, 3.0493, 3.1373, 3.1135, 3.0675, 2.8828, 2.7018, 2.6296, 2.8826, 2.9325, 2.9288, 2.9271, 2.9890, 3.0137, 2.9855, 3.0839, 2.9319, 2.3512, 2.3795, 2.6191, 2.7555, 2.9326, 2.9931, 3.1543, 3.0855, 2.6820, 3.0566, 3.1272, 3.1663, 3.1836, 3.0018, 2.9089, 3.1727, 3.1626, 3.1086, 2.9804, 3.1107, 3.2998, 3.3697, 3.3716, 3.2487, 3.1597, 3.1181" }, "chunk_size": { "string_value": "0.16" }, "max_execution_batch_size": { "string_value": "1024" }, "sample_rate": { "string_value": "16000" } }, "model_warmup": [], "model_transaction_policy": { "decoupled": false } } I1120 13:43:40.937856 86 detokenizer_cbe.cc:145] TRITONBACKEND_ModelInitialize: intent_slot_detokenizer (version 1) I1120 13:43:40.938998 86 backend_model.cc:303] model configuration: { "name": "intent_slot_detokenizer", "platform": "", "backend": "riva_nlp_detokenizer", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 8, "input": [ { "name": "IN_TOKEN_LABELS__0", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ -1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "IN_TOKEN_SCORES__1", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "IN_SEQ_LEN__2", "data_type": "TYPE_INT64", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "IN_TOK_STR__3", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ -1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": 
"OUT_TOKEN_LABELS__0", "data_type": "TYPE_STRING", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "OUT_TOKEN_SCORES__1", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "OUT_SEQ_LEN__2", "data_type": "TYPE_INT64", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "OUT_TOK_STR__3", "data_type": "TYPE_STRING", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "graph": { "level": 0 }, "priority": "PRIORITY_DEFAULT", "cuda": { "graphs": false, "busy_wait_events": false, "graph_spec": [], "output_copy_stream": true }, "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "instance_group": [ { "name": "intent_slot_detokenizer_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": {}, "model_warmup": [], "model_transaction_policy": { "decoupled": false } } I1120 13:43:40.939152 86 endpointing_library.cc:22] TRITONBACKEND_ModelInstanceInitialize: conformer-en-US-asr-offline-endpointing-streaming-offline_0 (device 0) I1120 13:43:41.051994 86 model_repository_manager.cc:1231] successfully loaded 'conformer-en-US-asr-offline-endpointing-streaming-offline' version 1 I1120 13:43:41.056403 86 ctc-decoder-library.cc:23] TRITONBACKEND_ModelInstanceInitialize: conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline_0 (device 0) > Riva waiting for Triton server to load all models...retrying in 1 second > Riva waiting for Triton server to load all models...retrying in 1 second I1120 13:43:43.842038 132 ctc-decoder.cc:174] Beam Decoder initialized successfully! 
I1120 13:43:43.842158 86 feature-extractor.cc:402] TRITONBACKEND_ModelInstanceInitialize: conformer-en-US-asr-streaming-feature-extractor-streaming_0 (device 0) I1120 13:43:43.842974 86 model_repository_manager.cc:1231] successfully loaded 'conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline' version 1 > Riva waiting for Triton server to load all models...retrying in 1 second > Riva waiting for Triton server to load all models...retrying in 1 second I1120 13:43:45.651917 86 model_repository_manager.cc:1231] successfully loaded 'conformer-en-US-asr-streaming-feature-extractor-streaming' version 1 I1120 13:43:45.656606 86 sequence_label_cbe.cc:137] TRITONBACKEND_ModelInitialize: intent_slot_label_tokens_weather (version 1) I1120 13:43:45.657782 86 backend_model.cc:303] model configuration: { "name": "intent_slot_label_tokens_weather", "platform": "", "backend": "riva_nlp_seqlabel", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 8, "input": [ { "name": "TOKEN_LOGIT__1", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1, 65 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "TOKEN_LABELS__0", "data_type": "TYPE_STRING", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "TOKEN_SCORES__1", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "priority": "PRIORITY_DEFAULT", "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "instance_group": [ { "name": "intent_slot_label_tokens_weather_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "classes": { "string_value": 
"/data/models/intent_slot_label_tokens_weather/1/slot_labels.csv" } }, "model_warmup": [] } I1120 13:43:45.657904 86 detokenizer_cbe.cc:147] TRITONBACKEND_ModelInstanceInitialize: intent_slot_detokenizer_0 (device 0) I1120 13:43:45.658016 86 feature-extractor.cc:402] TRITONBACKEND_ModelInstanceInitialize: conformer-en-US-asr-offline-feature-extractor-streaming-offline_0 (device 0) I1120 13:43:45.658286 86 model_repository_manager.cc:1231] successfully loaded 'intent_slot_detokenizer' version 1 I1120 13:43:45.673415 86 model_repository_manager.cc:1231] successfully loaded 'conformer-en-US-asr-offline-feature-extractor-streaming-offline' version 1 I1120 13:43:45.719694 86 sequence_label_cbe.cc:139] TRITONBACKEND_ModelInstanceInitialize: intent_slot_label_tokens_weather_0 (device 0) I1120 13:43:45.720205 86 model_repository_manager.cc:1231] successfully loaded 'intent_slot_label_tokens_weather' version 1 I1120 13:43:45.753166 86 qa_postprocessor_cbe.cc:124] TRITONBACKEND_ModelInitialize: qa_qa_postprocessor (version 1) I1120 13:43:45.754240 86 backend_model.cc:303] model configuration: { "name": "qa_qa_postprocessor", "platform": "", "backend": "riva_nlp_qa", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 8, "input": [ { "name": "QA_LOGITS__0", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ 384, 2 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "SEQ_LEN__1", "data_type": "TYPE_INT64", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "TOK_STR__2", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ 384 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "TOK_TO_ORIG__3", "data_type": "TYPE_UINT16", "format": "FORMAT_NONE", "dims": [ 384 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "IN_PASSAGE_STR__4", "data_type": 
"TYPE_STRING", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "ANSWER_SPANS__0", "data_type": "TYPE_STRING", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "ANSWER_SCORES__1", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "priority": "PRIORITY_DEFAULT", "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "instance_group": [ { "name": "qa_qa_postprocessor_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "version_2_with_negative": { "string_value": "True" }, "n_best_size": { "string_value": "20" }, "max_answer_length": { "string_value": "30" }, "bert_model_seq_length": { "string_value": "384" } }, "model_warmup": [] } I1120 13:43:45.754317 86 onnxruntime.cc:2481] TRITONBACKEND_ModelInitialize: riva-onnx-fastpitch_encoder-English-US (version 1) I1120 13:43:45.755484 86 tokenizer_library.cc:18] TRITONBACKEND_ModelInitialize: qa_tokenizer-en-US (version 1) WARNING: Logging before InitGoogleLogging() is written to STDERR W1120 13:43:45.756749 141 parameter_parser.cc:144] Parameter 'unk_token' set but unused. W1120 13:43:45.756794 141 parameter_parser.cc:144] Parameter 'vocab' set but unused. 
I1120 13:43:45.757007 86 backend_model.cc:303] model configuration: { "name": "qa_tokenizer-en-US", "platform": "", "backend": "riva_nlp_tokenizer", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 8, "input": [ { "name": "IN_QUERY_STR__0", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "IN_PASSAGE_STR__1", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "SEQ__0", "data_type": "TYPE_INT32", "dims": [ 384 ], "label_filename": "", "is_shape_tensor": false }, { "name": "MASK__1", "data_type": "TYPE_INT32", "dims": [ 384 ], "label_filename": "", "is_shape_tensor": false }, { "name": "SEQ_LEN__2", "data_type": "TYPE_INT64", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "TOK_STR__3", "data_type": "TYPE_STRING", "dims": [ 384 ], "label_filename": "", "is_shape_tensor": false }, { "name": "SEGMENT__4", "data_type": "TYPE_INT32", "dims": [ 384 ], "label_filename": "", "is_shape_tensor": false }, { "name": "TOK_TO_ORIG__5", "data_type": "TYPE_UINT16", "dims": [ 384 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "priority": "PRIORITY_DEFAULT", "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "instance_group": [ { "name": "qa_tokenizer-en-US_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "bos_token": { "string_value": "[CLS]" }, "max_query_length": { "string_value": "64" }, "eos_token": { "string_value": "[SEP]" }, "to_lower": { "string_value": "true" }, 
"pad_chars_with_spaces": { "string_value": "False" }, "task": { "string_value": "qa" }, "doc_stride": { "string_value": "128" }, "unk_token": { "string_value": "[UNK]" }, "tokenizer": { "string_value": "wordpiece" }, "vocab": { "string_value": "/data/models/qa_tokenizer-en-US/1/tokenizer.vocab_file" } }, "model_warmup": [] } I1120 13:43:45.794223 86 tensorrt.cc:5294] TRITONBACKEND_Initialize: tensorrt I1120 13:43:45.794263 86 tensorrt.cc:5304] Triton TRITONBACKEND API version: 1.9 I1120 13:43:45.794278 86 tensorrt.cc:5310] 'tensorrt' TRITONBACKEND API version: 1.9 I1120 13:43:45.794400 86 tensorrt.cc:5353] backend configuration: {} I1120 13:43:45.794489 86 onnxruntime.cc:2524] TRITONBACKEND_ModelInstanceInitialize: riva-onnx-fastpitch_encoder-English-US_0 (GPU device 0) > Riva waiting for Triton server to load all models...retrying in 1 second 2022-11-20 13:43:46.393223306 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '418'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393255283 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '490'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393262336 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '375'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393268429 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '346'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393273907 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '354'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393279130 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '307'. It is not used by any node and should be removed from the model. 
2022-11-20 13:43:46.393284298 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '379'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393289459 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '373'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393296277 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '301'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393301463 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '286'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393307855 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '447'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393312847 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '358'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393318318 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '281'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393339439 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '274'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393345175 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '374'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393350288 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '181'. It is not used by any node and should be removed from the model. 
2022-11-20 13:43:46.393356308 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '303'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393361703 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '302'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393367301 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '426'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393372698 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '425'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393378138 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '430'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393383245 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '282'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393388376 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '497'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393393336 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '445'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393398121 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '451'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393403246 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '498'. It is not used by any node and should be removed from the model. 
2022-11-20 13:43:46.393408473 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '502'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393413323 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '446'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393418205 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '353'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393423895 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '518'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393428721 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '519'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393433635 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '517'. It is not used by any node and should be removed from the model. 2022-11-20 13:43:46.393438377 [W:onnxruntime:, graph.cc:3526 CleanUnusedInitializersAndNodeArgs] Removing initializer '523'. It is not used by any node and should be removed from the model. 
> Riva waiting for Triton server to load all models...retrying in 1 second I1120 13:43:47.340671 86 qa_postprocessor_cbe.cc:126] TRITONBACKEND_ModelInstanceInitialize: qa_qa_postprocessor_0 (device 0) I1120 13:43:47.340817 86 tokenizer_library.cc:21] TRITONBACKEND_ModelInstanceInitialize: qa_tokenizer-en-US_0 (device 0) I1120 13:43:47.341168 86 model_repository_manager.cc:1231] successfully loaded 'riva-onnx-fastpitch_encoder-English-US' version 1 I1120 13:43:47.341187 86 model_repository_manager.cc:1231] successfully loaded 'qa_qa_postprocessor' version 1 I1120 13:43:47.379678 86 tokenizer_library.cc:18] TRITONBACKEND_ModelInitialize: intent_slot_tokenizer-en-US-weather (version 1) I1120 13:43:47.379853 86 model_repository_manager.cc:1231] successfully loaded 'qa_tokenizer-en-US' version 1 W1120 13:43:47.381434 136 parameter_parser.cc:144] Parameter 'unk_token' set but unused. W1120 13:43:47.381479 136 parameter_parser.cc:144] Parameter 'vocab' set but unused. I1120 13:43:47.381632 86 backend_model.cc:303] model configuration: { "name": "intent_slot_tokenizer-en-US-weather", "platform": "", "backend": "riva_nlp_tokenizer", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 8, "input": [ { "name": "INPUT_STR__0", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "SEQ__0", "data_type": "TYPE_INT32", "dims": [ 128 ], "label_filename": "", "is_shape_tensor": false }, { "name": "MASK__1", "data_type": "TYPE_INT32", "dims": [ 128 ], "label_filename": "", "is_shape_tensor": false }, { "name": "SEGMENT__4", "data_type": "TYPE_INT32", "dims": [ 128 ], "label_filename": "", "is_shape_tensor": false }, { "name": "SEQ_LEN__2", "data_type": "TYPE_INT64", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "TOK_STR__3", "data_type": "TYPE_STRING", "dims": [ 128 ], "label_filename": "", "is_shape_tensor": false } ], 
"batch_input": [], "batch_output": [], "optimization": { "priority": "PRIORITY_DEFAULT", "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "instance_group": [ { "name": "intent_slot_tokenizer-en-US-weather_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "unk_token": { "string_value": "[UNK]" }, "vocab": { "string_value": "/data/models/intent_slot_tokenizer-en-US-weather/1/tokenizer.vocab_file" }, "tokenizer": { "string_value": "wordpiece" }, "bos_token": { "string_value": "[CLS]" }, "eos_token": { "string_value": "[SEP]" }, "to_lower": { "string_value": "true" }, "pad_chars_with_spaces": { "string_value": "False" }, "task": { "string_value": "single_input" } }, "model_warmup": [] } I1120 13:43:47.381829 86 pipeline_library.cc:22] TRITONBACKEND_ModelInitialize: riva-punctuation-en-US (version 1) WARNING: Logging before InitGoogleLogging() is written to STDERR W1120 13:43:47.383203 143 parameter_parser.cc:144] Parameter 'attn_mask_tensor_name' set but unused. W1120 13:43:47.383255 143 parameter_parser.cc:144] Parameter 'bos_token' set but unused. W1120 13:43:47.383265 143 parameter_parser.cc:144] Parameter 'capit_logits_tensor_name' set but unused. W1120 13:43:47.383271 143 parameter_parser.cc:144] Parameter 'capitalization_mapping_path' set but unused. W1120 13:43:47.383278 143 parameter_parser.cc:144] Parameter 'delimiter' set but unused. W1120 13:43:47.383285 143 parameter_parser.cc:144] Parameter 'eos_token' set but unused. W1120 13:43:47.383292 143 parameter_parser.cc:144] Parameter 'input_ids_tensor_name' set but unused. W1120 13:43:47.383298 143 parameter_parser.cc:144] Parameter 'language_code' set but unused. 
W1120 13:43:47.383306 143 parameter_parser.cc:144] Parameter 'model_api' set but unused. W1120 13:43:47.383313 143 parameter_parser.cc:144] Parameter 'model_family' set but unused. W1120 13:43:47.383319 143 parameter_parser.cc:144] Parameter 'pad_chars_with_spaces' set but unused. W1120 13:43:47.383363 143 parameter_parser.cc:144] Parameter 'preserve_accents' set but unused. W1120 13:43:47.383371 143 parameter_parser.cc:144] Parameter 'punct_logits_tensor_name' set but unused. W1120 13:43:47.383378 143 parameter_parser.cc:144] Parameter 'punctuation_mapping_path' set but unused. W1120 13:43:47.383385 143 parameter_parser.cc:144] Parameter 'remove_spaces' set but unused. W1120 13:43:47.383391 143 parameter_parser.cc:144] Parameter 'to_lower' set but unused. W1120 13:43:47.383399 143 parameter_parser.cc:144] Parameter 'token_type_tensor_name' set but unused. W1120 13:43:47.383405 143 parameter_parser.cc:144] Parameter 'tokenizer_to_lower' set but unused. W1120 13:43:47.383412 143 parameter_parser.cc:144] Parameter 'unk_token' set but unused. W1120 13:43:47.383419 143 parameter_parser.cc:144] Parameter 'use_int64_nn_inputs' set but unused. W1120 13:43:47.383426 143 parameter_parser.cc:144] Parameter 'vocab' set but unused. W1120 13:43:47.383572 143 parameter_parser.cc:144] Parameter 'model_api' set but unused. W1120 13:43:47.383582 143 parameter_parser.cc:144] Parameter 'model_family' set but unused. 
I1120 13:43:47.383702 86 backend_model.cc:303] model configuration: { "name": "riva-punctuation-en-US", "platform": "", "backend": "riva_nlp_pipeline", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 8, "input": [ { "name": "PIPELINE_INPUT", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "PIPELINE_OUTPUT", "data_type": "TYPE_STRING", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "priority": "PRIORITY_DEFAULT", "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "instance_group": [ { "name": "riva-punctuation-en-US_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "punct_logits_tensor_name": { "string_value": "punct_logits" }, "language_code": { "string_value": "en-US" }, "input_ids_tensor_name": { "string_value": "input_ids" }, "model_name": { "string_value": "riva-trt-riva-punctuation-en-US-nn-bert-base-uncased" }, "tokenizer_to_lower": { "string_value": "true" }, "vocab": { "string_value": "/data/models/riva-punctuation-en-US/1/e222f352288a423da453a79b96cc7b75_vocab.txt" }, "capit_logits_tensor_name": { "string_value": "capit_logits" }, "eos_token": { "string_value": "[SEP]" }, "pipeline_type": { "string_value": "punctuation" }, "capitalization_mapping_path": { "string_value": "/data/models/riva-punctuation-en-US/1/fb06800834e74de1bdc32db51da9619c_capit_label_ids.csv" }, "token_type_tensor_name": { "string_value": "token_type_ids" }, "tokenizer": { "string_value": "wordpiece" }, "delimiter": { "string_value": " " }, "pad_chars_with_spaces": { "string_value": "False" }, 
"remove_spaces": { "string_value": "False" }, "use_int64_nn_inputs": { "string_value": "False" }, "preserve_accents": { "string_value": "false" }, "model_family": { "string_value": "riva" }, "unk_token": { "string_value": "[UNK]" }, "bos_token": { "string_value": "[CLS]" }, "punctuation_mapping_path": { "string_value": "/data/models/riva-punctuation-en-US/1/15eace99434b4c87ba28cbd294b48f43_punct_label_ids.csv" }, "model_api": { "string_value": "/nvidia.riva.nlp.RivaLanguageUnderstanding/PunctuateText" }, "to_lower": { "string_value": "true" }, "load_model": { "string_value": "false" }, "attn_mask_tensor_name": { "string_value": "attention_mask" } }, "model_warmup": [] } I1120 13:43:47.383821 86 tensorrt.cc:5405] TRITONBACKEND_ModelInitialize: riva-trt-conformer-en-US-asr-offline-am-streaming-offline (version 1) I1120 13:43:47.385221 86 tensorrt.cc:5405] TRITONBACKEND_ModelInitialize: riva-trt-conformer-en-US-asr-streaming-am-streaming (version 1) I1120 13:43:47.386719 86 tensorrt.cc:5454] TRITONBACKEND_ModelInstanceInitialize: riva-trt-conformer-en-US-asr-streaming-am-streaming_0 (GPU device 0) > Riva waiting for Triton server to load all models...retrying in 1 second I1120 13:43:48.566518 86 logging.cc:49] [MemUsageChange] Init CUDA: CPU +449, GPU +0, now: CPU 3066, GPU 7613 (MiB) > Riva waiting for Triton server to load all models...retrying in 1 second > Riva waiting for Triton server to load all models...retrying in 1 second > Riva waiting for Triton server to load all models...retrying in 1 second I1120 13:43:51.725720 86 logging.cc:49] Loaded engine size: 383 MiB I1120 13:43:52.068296 86 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 3966, GPU 7912 (MiB) I1120 13:43:52.070044 86 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +10, now: CPU 3966, GPU 7922 (MiB) I1120 13:43:52.071176 86 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +281, now: CPU 0, GPU 281 (MiB) I1120 
13:43:52.128041 86 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 3198, GPU 7913 (MiB) I1120 13:43:52.129355 86 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +1, GPU +8, now: CPU 3199, GPU 7921 (MiB) > Riva waiting for Triton server to load all models...retrying in 1 second > Riva waiting for Triton server to load all models...retrying in 1 second > Riva waiting for Triton server to load all models...retrying in 1 second I1120 13:43:54.323530 86 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +693, now: CPU 0, GPU 974 (MiB) > Riva waiting for Triton server to load all models...retrying in 1 second > Riva waiting for Triton server to load all models...retrying in 1 second I1120 13:43:56.829033 86 tensorrt.cc:1411] Created instance riva-trt-conformer-en-US-asr-streaming-am-streaming_0 on GPU 0 with stream priority 0 and optimization profile default[0]; I1120 13:43:56.829172 86 tensorrt.cc:5405] TRITONBACKEND_ModelInitialize: riva-trt-riva-punctuation-en-US-nn-bert-base-uncased (version 1) I1120 13:43:56.829509 86 model_repository_manager.cc:1231] successfully loaded 'riva-trt-conformer-en-US-asr-streaming-am-streaming' version 1 I1120 13:43:56.830928 86 tensorrt.cc:5405] TRITONBACKEND_ModelInitialize: riva-trt-riva_intent_weather-nn-bert-base-uncased (version 1) I1120 13:43:56.832370 86 tensorrt.cc:5405] TRITONBACKEND_ModelInitialize: riva-trt-riva_ner-nn-bert-base-uncased (version 1) I1120 13:43:56.833908 86 tensorrt.cc:5405] TRITONBACKEND_ModelInitialize: riva-trt-riva_qa-nn-bert-base-uncased (version 1) I1120 13:43:56.835368 86 tensorrt.cc:5405] TRITONBACKEND_ModelInitialize: riva-trt-riva_text_classification_domain-nn-bert-base-uncased (version 1) I1120 13:43:56.838612 86 tokenizer_library.cc:21] TRITONBACKEND_ModelInstanceInitialize: intent_slot_tokenizer-en-US-weather_0 (device 0) I1120 13:43:56.884640 86 pipeline_library.cc:25] TRITONBACKEND_ModelInstanceInitialize: 
riva-punctuation-en-US_0 (device 0) I1120 13:43:56.884895 86 model_repository_manager.cc:1231] successfully loaded 'intent_slot_tokenizer-en-US-weather' version 1 I1120 13:43:56.921378 86 tensorrt.cc:5454] TRITONBACKEND_ModelInstanceInitialize: riva-trt-conformer-en-US-asr-offline-am-streaming-offline_0 (GPU device 0) I1120 13:43:56.921603 86 model_repository_manager.cc:1231] successfully loaded 'riva-punctuation-en-US' version 1 I1120 13:43:56.925439 86 logging.cc:49] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 4569, GPU 10174 (MiB) > Riva waiting for Triton server to load all models...retrying in 1 second I1120 13:43:57.761299 86 logging.cc:49] Loaded engine size: 367 MiB I1120 13:43:58.036622 86 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 5418, GPU 10468 (MiB) I1120 13:43:58.039047 86 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +10, now: CPU 5418, GPU 10478 (MiB) I1120 13:43:58.040123 86 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +282, now: CPU 0, GPU 1256 (MiB) I1120 13:43:58.089121 86 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 4684, GPU 10470 (MiB) I1120 13:43:58.091127 86 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 4684, GPU 10478 (MiB) > Riva waiting for Triton server to load all models...retrying in 1 second > Riva waiting for Triton server to load all models...retrying in 1 second I1120 13:43:59.592001 86 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +362, now: CPU 0, GPU 1618 (MiB) I1120 13:43:59.837072 86 tensorrt.cc:1411] Created instance riva-trt-conformer-en-US-asr-offline-am-streaming-offline_0 on GPU 0 with stream priority 0 and optimization profile default[0]; I1120 13:43:59.837274 86 tensorrt.cc:5405] TRITONBACKEND_ModelInitialize: riva-trt-hifigan-English-US (version 1) I1120 13:43:59.838199 86 tensorrt.cc:5454] 
TRITONBACKEND_ModelInstanceInitialize: riva-trt-riva-punctuation-en-US-nn-bert-base-uncased_0 (GPU device 0) I1120 13:43:59.838280 86 backend_model.cc:181] Overriding execution policy to "TRITONBACKEND_EXECUTION_BLOCKING" for sequence model "riva-trt-hifigan-English-US" I1120 13:43:59.838636 86 model_repository_manager.cc:1231] successfully loaded 'riva-trt-conformer-en-US-asr-offline-am-streaming-offline' version 1 I1120 13:43:59.841900 86 logging.cc:49] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 4825, GPU 11360 (MiB) > Riva waiting for Triton server to load all models...retrying in 1 second > Riva waiting for Triton server to load all models...retrying in 1 second > Riva waiting for Triton server to load all models...retrying in 1 second > Riva waiting for Triton server to load all models...retrying in 1 second > Riva waiting for Triton server to load all models...retrying in 1 second I1120 13:44:05.126551 86 logging.cc:49] Loaded engine size: 833 MiB > Riva waiting for Triton server to load all models...retrying in 1 second I1120 13:44:05.642238 86 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 6909, GPU 11786 (MiB) I1120 13:44:05.644727 86 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +12, now: CPU 6909, GPU 11798 (MiB) I1120 13:44:05.645931 86 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +416, now: CPU 0, GPU 2034 (MiB) I1120 13:44:05.770732 86 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 5243, GPU 11788 (MiB) I1120 13:44:05.772771 86 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +10, now: CPU 5243, GPU 11798 (MiB) I1120 13:44:06.546171 86 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +39, now: CPU 0, GPU 2073 (MiB) I1120 13:44:06.546582 86 tensorrt.cc:1411] Created instance riva-trt-riva-punctuation-en-US-nn-bert-base-uncased_0 on GPU 0 with stream 
priority 0 and optimization profile default[0]; I1120 13:44:06.546668 86 tensorrt.cc:5454] TRITONBACKEND_ModelInstanceInitialize: riva-trt-riva_intent_weather-nn-bert-base-uncased_0 (GPU device 0) I1120 13:44:06.547105 86 model_repository_manager.cc:1231] successfully loaded 'riva-trt-riva-punctuation-en-US-nn-bert-base-uncased' version 1 I1120 13:44:06.549452 86 logging.cc:49] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 5260, GPU 11842 (MiB) > Riva waiting for Triton server to load all models...retrying in 1 second > Riva waiting for Triton server to load all models...retrying in 1 second I1120 13:44:08.087533 86 logging.cc:49] Loaded engine size: 208 MiB I1120 13:44:08.326982 86 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 5792, GPU 12184 (MiB) I1120 13:44:08.329667 86 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +10, now: CPU 5792, GPU 12194 (MiB) I1120 13:44:08.331025 86 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +95, now: CPU 0, GPU 2168 (MiB) I1120 13:44:08.375819 86 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 5375, GPU 12186 (MiB) I1120 13:44:08.377925 86 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 5375, GPU 12194 (MiB) I1120 13:44:08.509549 86 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +108, now: CPU 0, GPU 2276 (MiB) I1120 13:44:08.509960 86 tensorrt.cc:1411] Created instance riva-trt-riva_intent_weather-nn-bert-base-uncased_0 on GPU 0 with stream priority 0 and optimization profile default[0]; I1120 13:44:08.510072 86 tensorrt.cc:5454] TRITONBACKEND_ModelInstanceInitialize: riva-trt-riva_ner-nn-bert-base-uncased_0 (GPU device 0) I1120 13:44:08.510703 86 model_repository_manager.cc:1231] successfully loaded 'riva-trt-riva_intent_weather-nn-bert-base-uncased' version 1 I1120 13:44:08.514009 86 logging.cc:49] [MemUsageChange] Init 
CUDA: CPU +0, GPU +0, now: CPU 5489, GPU 12446 (MiB) > Riva waiting for Triton server to load all models...retrying in 1 second > Riva waiting for Triton server to load all models...retrying in 1 second I1120 13:44:09.721675 86 logging.cc:49] Loaded engine size: 209 MiB I1120 13:44:09.948156 86 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 6022, GPU 12788 (MiB) I1120 13:44:09.950973 86 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +1, GPU +10, now: CPU 6023, GPU 12798 (MiB) I1120 13:44:09.952378 86 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +96, now: CPU 0, GPU 2372 (MiB) I1120 13:44:09.990012 86 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +1, GPU +8, now: CPU 5604, GPU 12790 (MiB) I1120 13:44:09.992260 86 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 5604, GPU 12798 (MiB) I1120 13:44:10.126222 86 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +109, now: CPU 0, GPU 2481 (MiB) I1120 13:44:10.126695 86 tensorrt.cc:1411] Created instance riva-trt-riva_ner-nn-bert-base-uncased_0 on GPU 0 with stream priority 0 and optimization profile default[0]; I1120 13:44:10.126944 86 tensorrt.cc:5454] TRITONBACKEND_ModelInstanceInitialize: riva-trt-riva_qa-nn-bert-base-uncased_0 (GPU device 0) I1120 13:44:10.127183 86 model_repository_manager.cc:1231] successfully loaded 'riva-trt-riva_ner-nn-bert-base-uncased' version 1 I1120 13:44:10.131007 86 logging.cc:49] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 5717, GPU 13050 (MiB) > Riva waiting for Triton server to load all models...retrying in 1 second I1120 13:44:11.237256 86 logging.cc:49] Loaded engine size: 208 MiB I1120 13:44:11.472499 86 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 6248, GPU 13392 (MiB) I1120 13:44:11.475111 86 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +10, now: CPU 6248, GPU 
13402 (MiB) I1120 13:44:11.476469 86 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +1, GPU +94, now: CPU 1, GPU 2575 (MiB) I1120 13:44:11.510281 86 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 5832, GPU 13394 (MiB) I1120 13:44:11.512474 86 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 5832, GPU 13402 (MiB) I1120 13:44:11.639476 86 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +127, now: CPU 1, GPU 2702 (MiB) I1120 13:44:11.639908 86 tensorrt.cc:1411] Created instance riva-trt-riva_qa-nn-bert-base-uncased_0 on GPU 0 with stream priority 0 and optimization profile default[0]; I1120 13:44:11.639950 86 tensorrt.cc:5454] TRITONBACKEND_ModelInstanceInitialize: riva-trt-riva_text_classification_domain-nn-bert-base-uncased_0 (GPU device 0) I1120 13:44:11.642365 86 logging.cc:49] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 5945, GPU 13672 (MiB) I1120 13:44:11.653007 86 model_repository_manager.cc:1231] successfully loaded 'riva-trt-riva_qa-nn-bert-base-uncased' version 1 > Riva waiting for Triton server to load all models...retrying in 1 second > Riva waiting for Triton server to load all models...retrying in 1 second I1120 13:44:13.149450 86 logging.cc:49] Loaded engine size: 209 MiB I1120 13:44:13.433674 86 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 6479, GPU 14014 (MiB) I1120 13:44:13.436443 86 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +10, now: CPU 6479, GPU 14024 (MiB) I1120 13:44:13.437821 86 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +96, now: CPU 1, GPU 2798 (MiB) I1120 13:44:13.471911 86 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 6060, GPU 14016 (MiB) I1120 13:44:13.474590 86 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 6060, GPU 14024 
(MiB) I1120 13:44:13.596705 86 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +108, now: CPU 1, GPU 2906 (MiB) I1120 13:44:13.597157 86 tensorrt.cc:1411] Created instance riva-trt-riva_text_classification_domain-nn-bert-base-uncased_0 on GPU 0 with stream priority 0 and optimization profile default[0]; I1120 13:44:13.597310 86 spectrogram-chunker.cc:274] TRITONBACKEND_ModelInitialize: spectrogram_chunker-English-US (version 1) I1120 13:44:13.597571 86 model_repository_manager.cc:1231] successfully loaded 'riva-trt-riva_text_classification_domain-nn-bert-base-uncased' version 1 I1120 13:44:13.599496 86 backend_model.cc:303] model configuration: { "name": "spectrogram_chunker-English-US", "platform": "", "backend": "riva_tts_chunker", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 8, "input": [ { "name": "SPECTROGRAM", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ 80, -1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "IS_LAST_SENTENCE", "data_type": "TYPE_INT32", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "NUM_VALID_FRAMES_IN", "data_type": "TYPE_INT64", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "SENTENCE_NUM", "data_type": "TYPE_INT32", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "DURATIONS", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "PROCESSED_TEXT", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "VOLUME", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1 
], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "SPECTROGRAM_CHUNK", "data_type": "TYPE_FP32", "dims": [ 80, -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "END_FLAG", "data_type": "TYPE_INT32", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "NUM_VALID_SAMPLES_OUT", "data_type": "TYPE_INT32", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "SENTENCE_NUM", "data_type": "TYPE_INT32", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "DURATIONS", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PROCESSED_TEXT", "data_type": "TYPE_STRING", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "VOLUME", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "priority": "PRIORITY_DEFAULT", "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "sequence_batching": { "oldest": { "max_candidate_sequences": 8, "preferred_batch_size": [ 8 ], "max_queue_delay_microseconds": 1000 }, "max_sequence_idle_microseconds": 60000000, "control_input": [ { "name": "START", "control": [ { "kind": "CONTROL_SEQUENCE_START", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "READY", "control": [ { "kind": "CONTROL_SEQUENCE_READY", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "END", "control": [ { "kind": "CONTROL_SEQUENCE_END", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "CORRID", "control": [ { "kind": "CONTROL_SEQUENCE_CORRID", "int32_false_true": [], 
"fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_UINT64" } ] } ], "state": [] }, "instance_group": [ { "name": "spectrogram_chunker-English-US_0", "kind": "KIND_GPU", "count": 1, "gpus": [ 0 ], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "num_mels": { "string_value": "80" }, "supports_volume": { "string_value": "True" }, "num_samples_per_frame": { "string_value": "512" }, "max_execution_batch_size": { "string_value": "8" }, "chunk_length": { "string_value": "80" } }, "model_warmup": [], "model_transaction_policy": { "decoupled": true } } I1120 13:44:13.600024 86 spectrogram-chunker.cc:276] TRITONBACKEND_ModelInstanceInitialize: spectrogram_chunker-English-US_0 (device 0) I1120 13:44:13.600211 86 tokenizer_library.cc:18] TRITONBACKEND_ModelInitialize: text_classification_tokenizer-en-US-domain (version 1) W1120 13:44:13.601258 156 parameter_parser.cc:144] Parameter 'unk_token' set but unused. W1120 13:44:13.601280 156 parameter_parser.cc:144] Parameter 'vocab' set but unused. 
I1120 13:44:13.601464 86 backend_model.cc:303] model configuration: { "name": "text_classification_tokenizer-en-US-domain", "platform": "", "backend": "riva_nlp_tokenizer", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 8, "input": [ { "name": "INPUT_STR__0", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "SEQ__0", "data_type": "TYPE_INT32", "dims": [ 128 ], "label_filename": "", "is_shape_tensor": false }, { "name": "MASK__1", "data_type": "TYPE_INT32", "dims": [ 128 ], "label_filename": "", "is_shape_tensor": false }, { "name": "SEGMENT__4", "data_type": "TYPE_INT32", "dims": [ 128 ], "label_filename": "", "is_shape_tensor": false }, { "name": "SEQ_LEN__2", "data_type": "TYPE_INT64", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "TOK_STR__3", "data_type": "TYPE_STRING", "dims": [ 128 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "priority": "PRIORITY_DEFAULT", "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "instance_group": [ { "name": "text_classification_tokenizer-en-US-domain_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "unk_token": { "string_value": "[UNK]" }, "vocab": { "string_value": "/data/models/text_classification_tokenizer-en-US-domain/1/tokenizer.vocab_file" }, "tokenizer": { "string_value": "wordpiece" }, "bos_token": { "string_value": "[CLS]" }, "eos_token": { "string_value": "[SEP]" }, "to_lower": { "string_value": "true" }, "pad_chars_with_spaces": { "string_value": "False" }, "task": { "string_value": "single_input" } }, 
"model_warmup": [] } I1120 13:44:13.601822 86 detokenizer_cbe.cc:145] TRITONBACKEND_ModelInitialize: token_classification_detokenizer (version 1) I1120 13:44:13.601823 86 model_repository_manager.cc:1231] successfully loaded 'spectrogram_chunker-English-US' version 1 I1120 13:44:13.603048 86 backend_model.cc:303] model configuration: { "name": "token_classification_detokenizer", "platform": "", "backend": "riva_nlp_detokenizer", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 8, "input": [ { "name": "IN_TOKEN_LABELS__0", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ -1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "IN_TOKEN_SCORES__1", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "IN_SEQ_LEN__2", "data_type": "TYPE_INT64", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "IN_TOK_STR__3", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ -1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "OUT_TOKEN_LABELS__0", "data_type": "TYPE_STRING", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "OUT_TOKEN_SCORES__1", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "OUT_SEQ_LEN__2", "data_type": "TYPE_INT64", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "OUT_TOK_STR__3", "data_type": "TYPE_STRING", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "graph": { "level": 0 }, "priority": "PRIORITY_DEFAULT", "cuda": { "graphs": false, "busy_wait_events": false, "graph_spec": [], "output_copy_stream": true }, "input_pinned_memory": { "enable": true }, 
"output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "instance_group": [ { "name": "token_classification_detokenizer_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": {}, "model_warmup": [], "model_transaction_policy": { "decoupled": false } } I1120 13:44:13.603499 86 detokenizer_cbe.cc:147] TRITONBACKEND_ModelInstanceInitialize: token_classification_detokenizer_0 (device 0) I1120 13:44:13.603733 86 sequence_label_cbe.cc:137] TRITONBACKEND_ModelInitialize: token_classification_label_tokens (version 1) I1120 13:44:13.604828 86 backend_model.cc:303] model configuration: { "name": "token_classification_label_tokens", "platform": "", "backend": "riva_nlp_seqlabel", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 8, "input": [ { "name": "TOKEN_LOGIT__1", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1, 13 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "TOKEN_LABELS__0", "data_type": "TYPE_STRING", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "TOKEN_SCORES__1", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "priority": "PRIORITY_DEFAULT", "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "instance_group": [ { "name": "token_classification_label_tokens_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "classes": { "string_value": 
"/data/models/token_classification_label_tokens/1/label_ids.csv" } }, "model_warmup": [] } I1120 13:44:13.604974 86 tokenizer_library.cc:18] TRITONBACKEND_ModelInitialize: token_classification_tokenizer-en-US (version 1) I1120 13:44:13.605450 86 model_repository_manager.cc:1231] successfully loaded 'token_classification_detokenizer' version 1 W1120 13:44:13.605986 159 parameter_parser.cc:144] Parameter 'unk_token' set but unused. W1120 13:44:13.606005 159 parameter_parser.cc:144] Parameter 'vocab' set but unused. I1120 13:44:13.606118 86 backend_model.cc:303] model configuration: { "name": "token_classification_tokenizer-en-US", "platform": "", "backend": "riva_nlp_tokenizer", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 8, "input": [ { "name": "INPUT_STR__0", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "SEQ__0", "data_type": "TYPE_INT32", "dims": [ 128 ], "label_filename": "", "is_shape_tensor": false }, { "name": "MASK__1", "data_type": "TYPE_INT32", "dims": [ 128 ], "label_filename": "", "is_shape_tensor": false }, { "name": "SEGMENT__4", "data_type": "TYPE_INT32", "dims": [ 128 ], "label_filename": "", "is_shape_tensor": false }, { "name": "SEQ_LEN__2", "data_type": "TYPE_INT64", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "TOK_STR__3", "data_type": "TYPE_STRING", "dims": [ 128 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "priority": "PRIORITY_DEFAULT", "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "instance_group": [ { "name": "token_classification_tokenizer-en-US_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], 
"default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "unk_token": { "string_value": "[UNK]" }, "vocab": { "string_value": "/data/models/token_classification_tokenizer-en-US/1/tokenizer.vocab_file" }, "tokenizer": { "string_value": "wordpiece" }, "bos_token": { "string_value": "[CLS]" }, "eos_token": { "string_value": "[SEP]" }, "to_lower": { "string_value": "true" }, "pad_chars_with_spaces": { "string_value": "False" }, "task": { "string_value": "single_input" } }, "model_warmup": [] } I1120 13:44:13.614996 86 tts-postprocessor.cc:300] TRITONBACKEND_ModelInitialize: tts_postprocessor-English-US (version 1) I1120 13:44:13.619976 86 backend_model.cc:303] model configuration: { "name": "tts_postprocessor-English-US", "platform": "", "backend": "riva_tts_postprocessor", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 8, "input": [ { "name": "INPUT", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ 1, -1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "NUM_VALID_SAMPLES", "data_type": "TYPE_INT32", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "Prosody_volume", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "OUTPUT", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "priority": "PRIORITY_DEFAULT", "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "sequence_batching": { "oldest": { "max_candidate_sequences": 8, "preferred_batch_size": [ 8 ], "max_queue_delay_microseconds": 100 }, "max_sequence_idle_microseconds": 60000000, "control_input": [ { "name": "START", 
"control": [ { "kind": "CONTROL_SEQUENCE_START", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "READY", "control": [ { "kind": "CONTROL_SEQUENCE_READY", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "END", "control": [ { "kind": "CONTROL_SEQUENCE_END", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "CORRID", "control": [ { "kind": "CONTROL_SEQUENCE_CORRID", "int32_false_true": [], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_UINT64" } ] } ], "state": [] }, "instance_group": [ { "name": "tts_postprocessor-English-US_0", "kind": "KIND_GPU", "count": 1, "gpus": [ 0 ], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "hop_length": { "string_value": "256" }, "supports_volume": { "string_value": "True" }, "filter_length": { "string_value": "1024" }, "num_samples_per_frame": { "string_value": "512" }, "use_denoiser": { "string_value": "False" }, "chunk_num_samples": { "string_value": "40960" }, "fade_length": { "string_value": "256" }, "max_chunk_size": { "string_value": "131072" }, "max_execution_batch_size": { "string_value": "8" } }, "model_warmup": [], "model_transaction_policy": { "decoupled": false } } I1120 13:44:13.620234 86 tokenizer_library.cc:21] TRITONBACKEND_ModelInstanceInitialize: text_classification_tokenizer-en-US-domain_0 (device 0) I1120 13:44:13.658676 86 sequence_label_cbe.cc:139] TRITONBACKEND_ModelInstanceInitialize: token_classification_label_tokens_0 (device 0) I1120 13:44:13.658790 86 tokenizer_library.cc:21] TRITONBACKEND_ModelInstanceInitialize: token_classification_tokenizer-en-US_0 (device 0) I1120 13:44:13.659032 86 model_repository_manager.cc:1231] successfully loaded 
'text_classification_tokenizer-en-US-domain' version 1
I1120 13:44:13.659178 86 model_repository_manager.cc:1231] successfully loaded 'token_classification_label_tokens' version 1
I1120 13:44:13.688669 86 model_repository_manager.cc:1231] successfully loaded 'token_classification_tokenizer-en-US' version 1
I1120 13:44:13.733731 86 tensorrt.cc:5454] TRITONBACKEND_ModelInstanceInitialize: riva-trt-hifigan-English-US_0 (GPU device 0)
I1120 13:44:13.738390 86 logging.cc:49] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 6181, GPU 14276 (MiB)
> Riva waiting for Triton server to load all models...retrying in 1 second
I1120 13:44:13.989495 86 logging.cc:49] Loaded engine size: 32 MiB
I1120 13:44:14.045390 86 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 6255, GPU 14318 (MiB)
I1120 13:44:14.048102 86 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +10, now: CPU 6255, GPU 14328 (MiB)
I1120 13:44:14.049593 86 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +31, now: CPU 1, GPU 2937 (MiB)
I1120 13:44:14.056281 86 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 6189, GPU 14320 (MiB)
I1120 13:44:14.058605 86 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +1, GPU +8, now: CPU 6190, GPU 14328 (MiB)
I1120 13:44:14.060115 86 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +246, now: CPU 1, GPU 3183 (MiB)
I1120 13:44:14.061176 86 tensorrt.cc:1411] Created instance riva-trt-hifigan-English-US_0 on GPU 0 with stream priority 0 and optimization profile default[0];
I1120 13:44:14.061264 86 tts-postprocessor.cc:302] TRITONBACKEND_ModelInstanceInitialize: tts_postprocessor-English-US_0 (device 0)
I1120 13:44:14.061970 86 model_repository_manager.cc:1231] successfully loaded 'riva-trt-hifigan-English-US' version 1
I1120 13:44:14.089934 86 tts-preprocessor.cc:280] TRITONBACKEND_ModelInitialize: tts_preprocessor-English-US (version 1)
I1120 13:44:14.090733 86 model_repository_manager.cc:1231] successfully loaded 'tts_postprocessor-English-US' version 1
W1120 13:44:14.092556 86 tts-preprocessor.cc:241] Parameter abbreviation_path is deprecated
WARNING: Logging before InitGoogleLogging() is written to STDERR
I1120 13:44:14.093619 161 preprocessor.cc:232] TTS character mapping loaded from /data/models/tts_preprocessor-English-US/1/mapping.txt
I1120 13:44:14.290267 161 preprocessor.cc:269] TTS phonetic mapping loaded from /data/models/tts_preprocessor-English-US/1/18d619cc5aaf458bbda98fe02588e6a1_cmudict-0.7b_nv22.01
I1120 13:44:14.290927 161 preprocessor.cc:282] Abbreviation mapping loaded from /data/models/tts_preprocessor-English-US/1/abbr.txt
I1120 13:44:14.413909 161 preprocessor.cc:292] TTS normalizer loaded from /data/models/tts_preprocessor-English-US/1/
I1120 13:44:14.414050 86 backend_model.cc:303] model configuration:
{
    "name": "tts_preprocessor-English-US",
    "platform": "",
    "backend": "riva_tts_preprocessor",
    "version_policy": { "latest": { "num_versions": 1 } },
    "max_batch_size": 8,
    "input": [
        { "name": "input_string", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false },
        { "name": "speaker", "data_type": "TYPE_INT64", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }
    ],
    "output": [
        { "name": "output", "data_type": "TYPE_INT64", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false },
        { "name": "output_mask", "data_type": "TYPE_FP32", "dims": [ 1, 400, 1 ], "label_filename": "", "is_shape_tensor": false },
        { "name": "output_length", "data_type": "TYPE_INT32", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false },
        { "name": "is_last_sentence", "data_type": "TYPE_INT32", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false },
        { "name": "output_string", "data_type": "TYPE_STRING", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false },
        { "name": "sentence_num", "data_type": "TYPE_INT32", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false },
        { "name": "pitch", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false },
        { "name": "duration", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false },
        { "name": "volume", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false },
        { "name": "speaker", "data_type": "TYPE_INT64", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }
    ],
    "batch_input": [],
    "batch_output": [],
    "optimization": {
        "graph": { "level": 0 },
        "priority": "PRIORITY_DEFAULT",
        "cuda": { "graphs": false, "busy_wait_events": false, "graph_spec": [], "output_copy_stream": true },
        "input_pinned_memory": { "enable": true },
        "output_pinned_memory": { "enable": true },
        "gather_kernel_buffer_threshold": 0,
        "eager_batching": false
    },
    "sequence_batching": {
        "oldest": { "max_candidate_sequences": 8, "preferred_batch_size": [ 8 ], "max_queue_delay_microseconds": 100 },
        "max_sequence_idle_microseconds": 60000000,
        "control_input": [
            { "name": "START", "control": [ { "kind": "CONTROL_SEQUENCE_START", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] },
            { "name": "READY", "control": [ { "kind": "CONTROL_SEQUENCE_READY", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] },
            { "name": "END", "control": [ { "kind": "CONTROL_SEQUENCE_END", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] },
            { "name": "CORRID", "control": [ { "kind": "CONTROL_SEQUENCE_CORRID", "int32_false_true": [], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_UINT64" } ] }
        ],
        "state": []
    },
    "instance_group": [
        { "name": "tts_preprocessor-English-US_0", "kind": "KIND_GPU", "count": 1, "gpus": [ 0 ], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" }
    ],
    "default_model_filename": "",
    "cc_model_filenames": {},
    "metric_tags": {},
    "parameters": {
        "mapping_path": { "string_value": "/data/models/tts_preprocessor-English-US/1/mapping.txt" },
        "max_input_length": { "string_value": "2000" },
        "end_of_emphasis_token": { "string_value": "]" },
        "abbreviations_path": { "string_value": "/data/models/tts_preprocessor-English-US/1/abbr.txt" },
        "norm_proto_path": { "string_value": "/data/models/tts_preprocessor-English-US/1/" },
        "language": { "string_value": "en-US" },
        "phone_set": { "string_value": "arpabet" },
        "max_sequence_length": { "string_value": "400" },
        "enable_emphasis_tag": { "string_value": "True" },
        "pad_with_space": { "string_value": "True" },
        "start_of_emphasis_token": { "string_value": "[" },
        "supports_ragged_batches": { "string_value": "True" },
        "dictionary_path": { "string_value": "/data/models/tts_preprocessor-English-US/1/18d619cc5aaf458bbda98fe02588e6a1_cmudict-0.7b_nv22.01" },
        "upper_case_chars": { "string_value": "False" },
        "g2p_ignore_ambiguous": { "string_value": "True" }
    },
    "model_warmup": [],
    "model_transaction_policy": { "decoupled": true }
}
I1120 13:44:14.414410 86 tts-preprocessor.cc:282] TRITONBACKEND_ModelInstanceInitialize: tts_preprocessor-English-US_0 (device 0)
I1120 13:44:14.414880 86 model_repository_manager.cc:1231] successfully loaded 'tts_preprocessor-English-US' version 1
I1120 13:44:14.419349 86 model_repository_manager.cc:1077] loading: conformer-en-US-asr-streaming:1
I1120 13:44:14.519909 86 model_repository_manager.cc:1077] loading: conformer-en-US-asr-offline:1
I1120 13:44:14.620527 86 model_repository_manager.cc:1077] loading: fastpitch_hifigan_ensemble-English-US:1
I1120 13:44:14.721051 86 model_repository_manager.cc:1077] loading: riva_intent_weather:1
I1120 13:44:14.821779 86 model_repository_manager.cc:1077] loading: riva_ner:1
> Riva waiting for Triton server to load all models...retrying in 1 second
I1120 13:44:14.922710 86 model_repository_manager.cc:1077] loading: riva_qa:1
I1120 13:44:15.023665 86 model_repository_manager.cc:1077] loading: riva_text_classification_domain:1
I1120 13:44:15.124533 86 model_repository_manager.cc:1231] successfully loaded 'conformer-en-US-asr-offline' version 1
I1120 13:44:15.124584 86 model_repository_manager.cc:1231] successfully loaded 'riva_intent_weather' version 1
I1120 13:44:15.124673 86 model_repository_manager.cc:1231] successfully loaded 'conformer-en-US-asr-streaming' version 1
I1120 13:44:15.124892 86 model_repository_manager.cc:1231] successfully loaded 'riva_ner' version 1
I1120 13:44:15.124996 86 model_repository_manager.cc:1231] successfully loaded 'riva_qa' version 1
I1120 13:44:15.125011 86 model_repository_manager.cc:1231] successfully loaded 'fastpitch_hifigan_ensemble-English-US' version 1
I1120 13:44:15.125806 86 model_repository_manager.cc:1231] successfully loaded 'riva_text_classification_domain' version 1
I1120 13:44:15.126255 86 server.cc:549]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+
I1120 13:44:15.126609 86 server.cc:576]
+------------------------+---------------------------------------------------------------------------------------+--------+
| Backend                | Path                                                                                  | Config |
+------------------------+---------------------------------------------------------------------------------------+--------+
| riva_tts_chunker       | /opt/tritonserver/backends/riva_tts_chunker/libtriton_riva_tts_chunker.so             | {}     |
| riva_tts_preprocessor  | /opt/tritonserver/backends/riva_tts_preprocessor/libtriton_riva_tts_preprocessor.so   | {}     |
| riva_asr_features      | /opt/tritonserver/backends/riva_asr_features/libtriton_riva_asr_features.so           | {}     |
| riva_nlp_pipeline      | /opt/tritonserver/backends/riva_nlp_pipeline/libtriton_riva_nlp_pipeline.so           | {}     |
| onnxruntime            | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so                       | {}     |
| pytorch                | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so                               | {}     |
| riva_asr_decoder       | /opt/tritonserver/backends/riva_asr_decoder/libtriton_riva_asr_decoder.so             | {}     |
| riva_tts_postprocessor | /opt/tritonserver/backends/riva_tts_postprocessor/libtriton_riva_tts_postprocessor.so | {}     |
| riva_nlp_detokenizer   | /opt/tritonserver/backends/riva_nlp_detokenizer/libtriton_riva_nlp_detokenizer.so     | {}     |
| riva_asr_endpointing   | /opt/tritonserver/backends/riva_asr_endpointing/libtriton_riva_asr_endpointing.so     | {}     |
| tensorrt               | /opt/tritonserver/backends/tensorrt/libtriton_tensorrt.so                             | {}     |
| riva_nlp_seqlabel      | /opt/tritonserver/backends/riva_nlp_seqlabel/libtriton_riva_nlp_seqlabel.so           | {}     |
| riva_nlp_tokenizer     | /opt/tritonserver/backends/riva_nlp_tokenizer/libtriton_riva_nlp_tokenizer.so         | {}     |
| riva_nlp_qa            | /opt/tritonserver/backends/riva_nlp_qa/libtriton_riva_nlp_qa.so                       | {}     |
+------------------------+---------------------------------------------------------------------------------------+--------+
I1120 13:44:15.127192 86 server.cc:619]
+-----------------------------------------------------------------+---------+--------+
| Model                                                           | Version | Status |
+-----------------------------------------------------------------+---------+--------+
| conformer-en-US-asr-offline                                     | 1       | READY  |
| conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline   | 1       | READY  |
| conformer-en-US-asr-offline-endpointing-streaming-offline       | 1       | READY  |
| conformer-en-US-asr-offline-feature-extractor-streaming-offline | 1       | READY  |
| conformer-en-US-asr-streaming                                   | 1       | READY  |
| conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming         | 1       | READY  |
| conformer-en-US-asr-streaming-endpointing-streaming             | 1       | READY  |
| conformer-en-US-asr-streaming-feature-extractor-streaming       | 1       | READY  |
| fastpitch_hifigan_ensemble-English-US                           | 1       | READY  |
| intent_slot_detokenizer                                         | 1       | READY  |
| intent_slot_label_tokens_weather                                | 1       | READY  |
| intent_slot_tokenizer-en-US-weather                             | 1       | READY  |
| qa_qa_postprocessor                                             | 1       | READY  |
| qa_tokenizer-en-US                                              | 1       | READY  |
| riva-onnx-fastpitch_encoder-English-US                          | 1       | READY  |
| riva-punctuation-en-US                                          | 1       | READY  |
| riva-trt-conformer-en-US-asr-offline-am-streaming-offline       | 1       | READY  |
| riva-trt-conformer-en-US-asr-streaming-am-streaming             | 1       | READY  |
| riva-trt-hifigan-English-US                                     | 1       | READY  |
| riva-trt-riva-punctuation-en-US-nn-bert-base-uncased            | 1       | READY  |
| riva-trt-riva_intent_weather-nn-bert-base-uncased               | 1       | READY  |
| riva-trt-riva_ner-nn-bert-base-uncased                          | 1       | READY  |
| riva-trt-riva_qa-nn-bert-base-uncased                           | 1       | READY  |
| riva-trt-riva_text_classification_domain-nn-bert-base-uncased   | 1       | READY  |
| riva_intent_weather                                             | 1       | READY  |
| riva_ner                                                        | 1       | READY  |
| riva_qa                                                         | 1       | READY  |
| riva_text_classification_domain                                 | 1       | READY  |
| spectrogram_chunker-English-US                                  | 1       | READY  |
| text_classification_tokenizer-en-US-domain                      | 1       | READY  |
| token_classification_detokenizer                                | 1       | READY  |
| token_classification_label_tokens                               | 1       | READY  |
| token_classification_tokenizer-en-US                            | 1       | READY  |
| tts_postprocessor-English-US                                    | 1       | READY  |
| tts_preprocessor-English-US                                     | 1       | READY  |
+-----------------------------------------------------------------+---------+--------+
I1120 13:44:15.164674 86 metrics.cc:650] Collecting metrics for GPU 0: NVIDIA RTX A5000
I1120 13:44:15.165531 86 tritonserver.cc:2123]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton |
| server_version                   | 2.21.0 |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0]         | /data/models |
| model_control_mode               | MODE_NONE |
| strict_model_config              | 1 |
| rate_limit                       | OFF |
| pinned_memory_pool_byte_size     | 268435456 |
| cuda_memory_pool_byte_size{0}    | 1000000000 |
| response_cache_byte_size         | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness                 | 1 |
| exit_timeout                     | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
I1120 13:44:15.172115 86 grpc_server.cc:4544] Started GRPCInferenceService at 0.0.0.0:8001
I1120 13:44:15.173055 86 http_server.cc:3242] Started HTTPService at 0.0.0.0:8000
I1120 13:44:15.217011 86 http_server.cc:180] Started Metrics Service at 0.0.0.0:8002
> Triton server is ready...
I1120 13:44:15.907536 384 riva_server.cc:120] Using Insecure Server Credentials
I1120 13:44:15.929621 384 model_registry.cc:110] Successfully registered: conformer-en-US-asr-offline for ASR
I1120 13:44:15.943508 384 model_registry.cc:110] Successfully registered: conformer-en-US-asr-streaming for ASR
I1120 13:44:16.097797 384 model_registry.cc:110] Successfully registered: riva-punctuation-en-US for NLP
I1120 13:44:16.107568 384 model_registry.cc:110] Successfully registered: riva_intent_weather for NLP
I1120 13:44:16.108647 384 model_registry.cc:110] Successfully registered: riva_ner for NLP
I1120 13:44:16.109733 384 model_registry.cc:110] Successfully registered: riva_qa for NLP
I1120 13:44:16.110642 384 model_registry.cc:110] Successfully registered: riva_text_classification_domain for NLP
I1120 13:44:16.344429 384 model_registry.cc:110] Successfully registered: fastpitch_hifigan_ensemble-English-US for TTS
I1120 13:44:16.389441 384 riva_server.cc:160] Riva Conversational AI Server listening on 0.0.0.0:50051
W1120 13:44:16.389528 384 stats_reporter.cc:41] No API key provided. Stats reporting disabled.
I1120 13:44:43.582829 391 grpc_riva_tts.cc:300] TTSService.Synthesize called.
I1120 13:44:43.583415 391 grpc_riva_tts.cc:327] Using multispeaker model fastpitch_hifigan_ensemble-English-US for inference with speaker_id: 1
I1120 13:44:44.368643 391 grpc_riva_tts.cc:414] TTSService.Synthesize response contains 345088 audio samples.
I1120 13:45:04.722137 391 grpc_riva_asr.cc:492] ASRService.Recognize called.
I1120 13:45:04.722918 391 riva_asr_stream.cc:214] Detected format: encoding = 1 numchannels = 1 samplerate = 44100 bitspersample = 16
I1120 13:45:04.722961 391 grpc_riva_asr.cc:558] ASRService.Recognize performing streaming recognition with sequence id: 226649607
I1120 13:45:04.723071 391 grpc_riva_asr.cc:588] Using model conformer-en-US-asr-offline for inference
I1120 13:45:04.723232 391 grpc_riva_asr.cc:603] Model sample rate= 16000 for inference
I1120 13:45:04.729291 464 grpc_riva_asr.cc:717] Creating resampler, audio file sample rate=44100 model sample_rate=16000
I1120 13:45:04.956303 463 grpc_riva_asr.cc:320] max_alternative is greater than max_transcripts supported by server
I1120 13:45:05.174083 391 grpc_riva_asr.cc:268] max_alternative is greater than max_transcripts supported by server
I1120 13:45:05.174158 391 grpc_riva_asr.cc:672] ASRService.Recognize returning OK