==========================
=== Riva Speech Skills ===
==========================

NVIDIA Release (build 46434648)

Copyright (c) 2018-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

> Riva waiting for Triton server to load all models...retrying in 1 second
I1213 14:41:17.583304 103 libtorch.cc:1381] TRITONBACKEND_Initialize: pytorch
I1213 14:41:17.583347 103 libtorch.cc:1391] Triton TRITONBACKEND API version: 1.9
I1213 14:41:17.583425 103 libtorch.cc:1397] 'pytorch' TRITONBACKEND API version: 1.9
I1213 14:41:17.590055 103 onnxruntime.cc:2400] TRITONBACKEND_Initialize: onnxruntime
I1213 14:41:17.590062 103 onnxruntime.cc:2410] Triton TRITONBACKEND API version: 1.9
I1213 14:41:17.590200 103 onnxruntime.cc:2416] 'onnxruntime' TRITONBACKEND API version: 1.9
I1213 14:41:17.590203 103 onnxruntime.cc:2446] backend configuration: {}
> Riva waiting for Triton server to load all models...retrying in 1 second
I1213 14:41:18.511627 103 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f166c000000' with size 268435456
I1213 14:41:18.511775 103 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 1000000000
I1213 14:41:18.517937 103 model_repository_manager.cc:1077] loading: conformer-en-US-asr-streaming-endpointing-streaming:1
I1213 14:41:18.618763 103 model_repository_manager.cc:1077] loading: conformer-en-US-asr-offline-feature-extractor-streaming-offline:1
I1213 14:41:18.623173 103 endpointing_library.cc:18] TRITONBACKEND_ModelInitialize: conformer-en-US-asr-streaming-endpointing-streaming (version 1)
WARNING: Logging before InitGoogleLogging() is written to STDERR
W1213 14:41:18.626590 113 parameter_parser.cc:144] Parameter 'chunk_size' set but unused.
W1213 14:41:18.626618 113 parameter_parser.cc:144] Parameter 'ms_per_timestep' set but unused.
W1213 14:41:18.626622 113 parameter_parser.cc:144] Parameter 'residue_blanks_at_end' set but unused.
W1213 14:41:18.626626 113 parameter_parser.cc:144] Parameter 'residue_blanks_at_start' set but unused.
W1213 14:41:18.626628 113 parameter_parser.cc:144] Parameter 'start_history' set but unused.
W1213 14:41:18.626631 113 parameter_parser.cc:144] Parameter 'start_th' set but unused.
W1213 14:41:18.626634 113 parameter_parser.cc:144] Parameter 'stop_history' set but unused.
W1213 14:41:18.626637 113 parameter_parser.cc:144] Parameter 'stop_th' set but unused.
W1213 14:41:18.626641 113 parameter_parser.cc:144] Parameter 'streaming' set but unused.
W1213 14:41:18.626643 113 parameter_parser.cc:144] Parameter 'use_subword' set but unused.
W1213 14:41:18.626646 113 parameter_parser.cc:144] Parameter 'vocab_file' set but unused.
I1213 14:41:18.627101 103 backend_model.cc:303] model configuration: { "name": "conformer-en-US-asr-streaming-endpointing-streaming", "platform": "", "backend": "riva_asr_endpointing", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 2048, "input": [ { "name": "CLASS_LOGITS", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1, 257 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "SEGMENTS_START_END", "data_type": "TYPE_FP32", "dims": [ -1, 2 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "graph": { "level": 0 }, "priority": "PRIORITY_DEFAULT", "cuda": { "graphs": false, "busy_wait_events": false, "graph_spec": [], "output_copy_stream": true }, "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "sequence_batching": { "max_sequence_idle_microseconds": 60000000, "control_input": [ { "name": "START", "control": [ { "kind": "CONTROL_SEQUENCE_START", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "READY", "control": [ { "kind": "CONTROL_SEQUENCE_READY", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] } ], "state": [] }, "instance_group": [ { "name": "conformer-en-US-asr-streaming-endpointing-streaming_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "endpointing_type": { "string_value": "greedy_ctc" }, "stop_th": { "string_value": "0.98" }, "vocab_file": { "string_value": "/data/models/conformer-en-US-asr-streaming-endpointing-streaming/1/riva_decoder_vocabulary.txt" }, "start_history": { "string_value": "200" }, 
"residue_blanks_at_start": { "string_value": "-2" }, "ms_per_timestep": { "string_value": "40" }, "streaming": { "string_value": "True" }, "use_subword": { "string_value": "True" }, "stop_history": { "string_value": "800" }, "residue_blanks_at_end": { "string_value": "0" }, "start_th": { "string_value": "0.2" }, "chunk_size": { "string_value": "0.16" } }, "model_warmup": [], "model_transaction_policy": { "decoupled": false } } I1213 14:41:18.627504 103 endpointing_library.cc:22] TRITONBACKEND_ModelInstanceInitialize: conformer-en-US-asr-streaming-endpointing-streaming_0 (device 0) I1213 14:41:18.659704 103 model_repository_manager.cc:1231] successfully loaded 'conformer-en-US-asr-streaming-endpointing-streaming' version 1 I1213 14:41:18.719225 103 model_repository_manager.cc:1077] loading: conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming:1 I1213 14:41:18.722057 103 feature-extractor.cc:400] TRITONBACKEND_ModelInitialize: conformer-en-US-asr-offline-feature-extractor-streaming-offline (version 1) I1213 14:41:18.730297 103 backend_model.cc:303] model configuration: { "name": "conformer-en-US-asr-offline-feature-extractor-streaming-offline", "platform": "", "backend": "riva_asr_features", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 512, "input": [ { "name": "AUDIO_SIGNAL", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "SAMPLE_RATE", "data_type": "TYPE_UINT32", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "AUDIO_FEATURES", "data_type": "TYPE_FP32", "dims": [ 80, -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "AUDIO_PROCESSED", "data_type": "TYPE_FP32", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "AUDIO_FEATURES_LENGTH", "data_type": "TYPE_INT32", "dims": [ 1 ], "label_filename": "", 
"is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "graph": { "level": 0 }, "priority": "PRIORITY_DEFAULT", "cuda": { "graphs": false, "busy_wait_events": false, "graph_spec": [], "output_copy_stream": true }, "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "sequence_batching": { "oldest": { "max_candidate_sequences": 512, "preferred_batch_size": [ 256, 512 ], "max_queue_delay_microseconds": 1000 }, "max_sequence_idle_microseconds": 60000000, "control_input": [ { "name": "START", "control": [ { "kind": "CONTROL_SEQUENCE_START", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "READY", "control": [ { "kind": "CONTROL_SEQUENCE_READY", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "END", "control": [ { "kind": "CONTROL_SEQUENCE_END", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "CORRID", "control": [ { "kind": "CONTROL_SEQUENCE_CORRID", "int32_false_true": [], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_UINT64" } ] } ], "state": [] }, "instance_group": [ { "name": "conformer-en-US-asr-offline-feature-extractor-streaming-offline_0", "kind": "KIND_GPU", "count": 1, "gpus": [ 0 ], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "streaming": { "string_value": "True" }, "left_padding_size": { "string_value": "1.6" }, "stddev_floor": { "string_value": "1e-05" }, "transpose": { "string_value": "False" }, "right_padding_size": { "string_value": "1.6" }, "gain": { "string_value": "1.0" }, "use_utterance_norm_params": { "string_value": "False" }, "precalc_norm_time_steps": { 
"string_value": "0" }, "precalc_norm_params": { "string_value": "False" }, "apply_normalization": { "string_value": "True" }, "dither": { "string_value": "0.0" }, "norm_per_feature": { "string_value": "True" }, "mean": { "string_value": "-11.4412, -9.9334, -9.1292, -9.0365, -9.2804, -9.5643, -9.7342, -9.6925, -9.6333, -9.2808, -9.1887, -9.1422, -9.1397, -9.2028, -9.2749, -9.4776, -9.9185, -10.1557, -10.3800, -10.5067, -10.3190, -10.4728, -10.5529, -10.6402, -10.6440, -10.5113, -10.7395, -10.7870, -10.6074, -10.5033, -10.8278, -10.6384, -10.8481, -10.6875, -10.5454, -10.4747, -10.5165, -10.4930, -10.3413, -10.3472, -10.3735, -10.6830, -10.8813, -10.6338, -10.3856, -10.7727, -10.8957, -10.8068, -10.7373, -10.6108, -10.3405, -10.2889, -10.3922, -10.4946, -10.3367, -10.4164, -10.9949, -10.7196, -10.3971, -10.1734, -9.9257, -9.6557, -9.1761, -9.6653, -9.7876, -9.7230, -9.7792, -9.7056, -9.2702, -9.4650, -9.2755, -9.1369, -9.1174, -8.9197, -8.5394, -8.2614, -8.1353, -8.1422, -8.3430, -8.6655" }, "stddev": { "string_value": "2.2668, 3.1642, 3.7079, 3.7642, 3.5349, 3.5901, 3.7640, 3.8424, 4.0145, 4.1475, 4.0457, 3.9048, 3.7709, 3.6117, 3.3188, 3.1489, 3.0615, 3.0362, 2.9929, 3.0500, 3.0341, 3.0484, 3.0103, 2.9474, 2.9128, 2.8669, 2.8332, 2.9411, 3.0378, 3.0712, 3.0190, 2.9992, 3.0124, 3.0024, 3.0275, 3.0870, 3.0656, 3.0142, 3.0493, 3.1373, 3.1135, 3.0675, 2.8828, 2.7018, 2.6296, 2.8826, 2.9325, 2.9288, 2.9271, 2.9890, 3.0137, 2.9855, 3.0839, 2.9319, 2.3512, 2.3795, 2.6191, 2.7555, 2.9326, 2.9931, 3.1543, 3.0855, 2.6820, 3.0566, 3.1272, 3.1663, 3.1836, 3.0018, 2.9089, 3.1727, 3.1626, 3.1086, 2.9804, 3.1107, 3.2998, 3.3697, 3.3716, 3.2487, 3.1597, 3.1181" }, "chunk_size": { "string_value": "4.8" }, "max_execution_batch_size": { "string_value": "512" }, "sample_rate": { "string_value": "16000" }, "window_size": { "string_value": "0.025" }, "num_features": { "string_value": "80" }, "window_stride": { "string_value": "0.01" } }, "model_warmup": [], "model_transaction_policy": { 
"decoupled": false } } I1213 14:41:18.730896 103 feature-extractor.cc:402] TRITONBACKEND_ModelInstanceInitialize: conformer-en-US-asr-offline-feature-extractor-streaming-offline_0 (device 0) I1213 14:41:18.819565 103 model_repository_manager.cc:1077] loading: conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline:1 I1213 14:41:18.919798 103 model_repository_manager.cc:1077] loading: conformer-en-US-asr-offline-endpointing-streaming-offline:1 I1213 14:41:19.020064 103 model_repository_manager.cc:1077] loading: conformer-en-US-asr-streaming-feature-extractor-streaming:1 I1213 14:41:19.120386 103 model_repository_manager.cc:1077] loading: conformer-en-US-asr-streaming-throughput-ctc-decoder-cpu-streaming:1 > Riva waiting for Triton server to load all models...retrying in 1 second I1213 14:41:19.220695 103 model_repository_manager.cc:1077] loading: conformer-en-US-asr-streaming-throughput-endpointing-streaming:1 I1213 14:41:19.320986 103 model_repository_manager.cc:1077] loading: conformer-en-US-asr-streaming-throughput-feature-extractor-streaming:1 I1213 14:41:19.421303 103 model_repository_manager.cc:1077] loading: riva-punctuation-en-US:1 I1213 14:41:19.521663 103 model_repository_manager.cc:1077] loading: riva-trt-conformer-en-US-asr-offline-am-streaming-offline:1 I1213 14:41:19.598212 103 model_repository_manager.cc:1231] successfully loaded 'conformer-en-US-asr-offline-feature-extractor-streaming-offline' version 1 I1213 14:41:19.606813 103 ctc-decoder-library.cc:20] TRITONBACKEND_ModelInitialize: conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming (version 1) WARNING: Logging before InitGoogleLogging() is written to STDERR W1213 14:41:19.608075 138 parameter_parser.cc:144] Parameter 'beam_size' set but unused. W1213 14:41:19.608089 138 parameter_parser.cc:144] Parameter 'beam_size_token' set but unused. W1213 14:41:19.608090 138 parameter_parser.cc:144] Parameter 'beam_threshold' set but unused. 
W1213 14:41:19.608091 138 parameter_parser.cc:144] Parameter 'blank_token' set but unused.
W1213 14:41:19.608093 138 parameter_parser.cc:144] Parameter 'decoder_num_worker_threads' set but unused.
W1213 14:41:19.608094 138 parameter_parser.cc:144] Parameter 'forerunner_beam_size' set but unused.
W1213 14:41:19.608095 138 parameter_parser.cc:144] Parameter 'forerunner_beam_size_token' set but unused.
W1213 14:41:19.608098 138 parameter_parser.cc:144] Parameter 'forerunner_beam_threshold' set but unused.
W1213 14:41:19.608098 138 parameter_parser.cc:144] Parameter 'forerunner_use_lm' set but unused.
W1213 14:41:19.608099 138 parameter_parser.cc:144] Parameter 'language_model_file' set but unused.
W1213 14:41:19.608100 138 parameter_parser.cc:144] Parameter 'lexicon_file' set but unused.
W1213 14:41:19.608103 138 parameter_parser.cc:144] Parameter 'lm_weight' set but unused.
W1213 14:41:19.608103 138 parameter_parser.cc:144] Parameter 'log_add' set but unused.
W1213 14:41:19.608104 138 parameter_parser.cc:144] Parameter 'max_execution_batch_size' set but unused.
W1213 14:41:19.608106 138 parameter_parser.cc:144] Parameter 'max_supported_transcripts' set but unused.
W1213 14:41:19.608107 138 parameter_parser.cc:144] Parameter 'num_tokenization' set but unused.
W1213 14:41:19.608108 138 parameter_parser.cc:144] Parameter 'profane_words_file' set but unused.
W1213 14:41:19.608110 138 parameter_parser.cc:144] Parameter 'set_default_index_to_unk_token' set but unused.
W1213 14:41:19.608111 138 parameter_parser.cc:144] Parameter 'sil_token' set but unused.
W1213 14:41:19.608112 138 parameter_parser.cc:144] Parameter 'smearing_mode' set but unused.
W1213 14:41:19.608114 138 parameter_parser.cc:144] Parameter 'tokenizer_model' set but unused.
W1213 14:41:19.608115 138 parameter_parser.cc:144] Parameter 'unk_score' set but unused.
W1213 14:41:19.608117 138 parameter_parser.cc:144] Parameter 'unk_token' set but unused.
W1213 14:41:19.608119 138 parameter_parser.cc:144] Parameter 'vocab_file' set but unused. W1213 14:41:19.608119 138 parameter_parser.cc:144] Parameter 'word_insertion_score' set but unused. I1213 14:41:19.608844 103 backend_model.cc:303] model configuration: { "name": "conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming", "platform": "", "backend": "riva_asr_decoder", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 1024, "input": [ { "name": "CLASS_LOGITS", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1, 257 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "END_FLAG", "data_type": "TYPE_UINT32", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "SEGMENTS_START_END", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1, 2 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "CUSTOM_CONFIGURATION", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ -1, 2 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "FINAL_TRANSCRIPTS", "data_type": "TYPE_STRING", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "FINAL_TRANSCRIPTS_SCORE", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "FINAL_WORDS_START_END", "data_type": "TYPE_INT32", "dims": [ -1, 2 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PARTIAL_TRANSCRIPTS", "data_type": "TYPE_STRING", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PARTIAL_TRANSCRIPTS_STABILITY", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PARTIAL_WORDS_START_END", "data_type": "TYPE_INT32", "dims": [ -1, 2 ], "label_filename": "", "is_shape_tensor": false }, { "name": "FINAL_WORDS_SCORE", 
"data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PARTIAL_WORDS_SCORE", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "graph": { "level": 0 }, "priority": "PRIORITY_DEFAULT", "cuda": { "graphs": false, "busy_wait_events": false, "graph_spec": [], "output_copy_stream": true }, "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "sequence_batching": { "oldest": { "max_candidate_sequences": 1024, "preferred_batch_size": [ 32, 64 ], "max_queue_delay_microseconds": 1000 }, "max_sequence_idle_microseconds": 60000000, "control_input": [ { "name": "START", "control": [ { "kind": "CONTROL_SEQUENCE_START", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "READY", "control": [ { "kind": "CONTROL_SEQUENCE_READY", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "END", "control": [ { "kind": "CONTROL_SEQUENCE_END", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "CORRID", "control": [ { "kind": "CONTROL_SEQUENCE_CORRID", "int32_false_true": [], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_UINT64" } ] } ], "state": [] }, "instance_group": [ { "name": "conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "decoder_type": { "string_value": "flashlight" }, "forerunner_beam_size": { "string_value": "8" }, "unk_score": { "string_value": "-inf" }, "chunk_size": { "string_value": 
"0.16" }, "max_supported_transcripts": { "string_value": "1" }, "lexicon_file": { "string_value": "/data/models/conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/lexicon.txt" }, "smearing_mode": { "string_value": "max" }, "log_add": { "string_value": "True" }, "blank_token": { "string_value": "#" }, "lm_weight": { "string_value": "0.8" }, "vocab_file": { "string_value": "/data/models/conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/riva_decoder_vocabulary.txt" }, "ms_per_timestep": { "string_value": "40" }, "streaming": { "string_value": "True" }, "use_subword": { "string_value": "True" }, "beam_size": { "string_value": "32" }, "right_padding_size": { "string_value": "1.92" }, "beam_size_token": { "string_value": "16" }, "sil_token": { "string_value": "▁" }, "num_tokenization": { "string_value": "1" }, "beam_threshold": { "string_value": "20.0" }, "language_model_file": { "string_value": "/data/models/conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/4gram-pruned-0_2_7_9-en-lm-set-2.0.bin" }, "tokenizer_model": { "string_value": "/data/models/conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/8b8f095152034e98b24ab33726708bd0_tokenizer.model" }, "max_execution_batch_size": { "string_value": "1024" }, "forerunner_use_lm": { "string_value": "true" }, "profane_words_file": { "string_value": "/data/models/conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/profane_words_file.txt" }, "forerunner_beam_size_token": { "string_value": "8" }, "forerunner_beam_threshold": { "string_value": "10.0" }, "decoder_num_worker_threads": { "string_value": "-1" }, "asr_model_delay": { "string_value": "-1" }, "word_insertion_score": { "string_value": "1.0" }, "unk_token": { "string_value": "" }, "left_padding_size": { "string_value": "1.92" }, "set_default_index_to_unk_token": { "string_value": "False" } }, "model_warmup": [], "model_transaction_policy": { "decoupled": false } } I1213 14:41:19.608918 103 ctc-decoder-library.cc:20] 
TRITONBACKEND_ModelInitialize: conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline (version 1)
W1213 14:41:19.609658 139 parameter_parser.cc:144] Parameter 'beam_size' set but unused.
W1213 14:41:19.609663 139 parameter_parser.cc:144] Parameter 'beam_size_token' set but unused.
W1213 14:41:19.609664 139 parameter_parser.cc:144] Parameter 'beam_threshold' set but unused.
W1213 14:41:19.609665 139 parameter_parser.cc:144] Parameter 'blank_token' set but unused.
W1213 14:41:19.609668 139 parameter_parser.cc:144] Parameter 'decoder_num_worker_threads' set but unused.
W1213 14:41:19.609669 139 parameter_parser.cc:144] Parameter 'forerunner_beam_size' set but unused.
W1213 14:41:19.609670 139 parameter_parser.cc:144] Parameter 'forerunner_beam_size_token' set but unused.
W1213 14:41:19.609671 139 parameter_parser.cc:144] Parameter 'forerunner_beam_threshold' set but unused.
W1213 14:41:19.609673 139 parameter_parser.cc:144] Parameter 'forerunner_use_lm' set but unused.
W1213 14:41:19.609674 139 parameter_parser.cc:144] Parameter 'language_model_file' set but unused.
W1213 14:41:19.609676 139 parameter_parser.cc:144] Parameter 'lexicon_file' set but unused.
W1213 14:41:19.609678 139 parameter_parser.cc:144] Parameter 'lm_weight' set but unused.
W1213 14:41:19.609679 139 parameter_parser.cc:144] Parameter 'log_add' set but unused.
W1213 14:41:19.609680 139 parameter_parser.cc:144] Parameter 'max_execution_batch_size' set but unused.
W1213 14:41:19.609683 139 parameter_parser.cc:144] Parameter 'max_supported_transcripts' set but unused.
W1213 14:41:19.609683 139 parameter_parser.cc:144] Parameter 'num_tokenization' set but unused.
W1213 14:41:19.609685 139 parameter_parser.cc:144] Parameter 'profane_words_file' set but unused.
W1213 14:41:19.609686 139 parameter_parser.cc:144] Parameter 'set_default_index_to_unk_token' set but unused.
W1213 14:41:19.609688 139 parameter_parser.cc:144] Parameter 'sil_token' set but unused.
W1213 14:41:19.609689 139 parameter_parser.cc:144] Parameter 'smearing_mode' set but unused. W1213 14:41:19.609691 139 parameter_parser.cc:144] Parameter 'tokenizer_model' set but unused. W1213 14:41:19.609692 139 parameter_parser.cc:144] Parameter 'unk_score' set but unused. W1213 14:41:19.609694 139 parameter_parser.cc:144] Parameter 'unk_token' set but unused. W1213 14:41:19.609695 139 parameter_parser.cc:144] Parameter 'vocab_file' set but unused. W1213 14:41:19.609696 139 parameter_parser.cc:144] Parameter 'word_insertion_score' set but unused. I1213 14:41:19.610323 103 backend_model.cc:303] model configuration: { "name": "conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline", "platform": "", "backend": "riva_asr_decoder", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 1024, "input": [ { "name": "CLASS_LOGITS", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1, 257 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "END_FLAG", "data_type": "TYPE_UINT32", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "SEGMENTS_START_END", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1, 2 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "CUSTOM_CONFIGURATION", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ -1, 2 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "FINAL_TRANSCRIPTS", "data_type": "TYPE_STRING", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "FINAL_TRANSCRIPTS_SCORE", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "FINAL_WORDS_START_END", "data_type": "TYPE_INT32", "dims": [ -1, 2 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PARTIAL_TRANSCRIPTS", "data_type": "TYPE_STRING", 
"dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PARTIAL_TRANSCRIPTS_STABILITY", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PARTIAL_WORDS_START_END", "data_type": "TYPE_INT32", "dims": [ -1, 2 ], "label_filename": "", "is_shape_tensor": false }, { "name": "FINAL_WORDS_SCORE", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PARTIAL_WORDS_SCORE", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "graph": { "level": 0 }, "priority": "PRIORITY_DEFAULT", "cuda": { "graphs": false, "busy_wait_events": false, "graph_spec": [], "output_copy_stream": true }, "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "sequence_batching": { "oldest": { "max_candidate_sequences": 1024, "preferred_batch_size": [ 32, 64 ], "max_queue_delay_microseconds": 1000 }, "max_sequence_idle_microseconds": 60000000, "control_input": [ { "name": "START", "control": [ { "kind": "CONTROL_SEQUENCE_START", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "READY", "control": [ { "kind": "CONTROL_SEQUENCE_READY", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "END", "control": [ { "kind": "CONTROL_SEQUENCE_END", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "CORRID", "control": [ { "kind": "CONTROL_SEQUENCE_CORRID", "int32_false_true": [], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_UINT64" } ] } ], "state": [] }, "instance_group": [ { "name": "conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline_0", "kind": "KIND_CPU", 
"count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "left_padding_size": { "string_value": "1.6" }, "set_default_index_to_unk_token": { "string_value": "False" }, "decoder_type": { "string_value": "flashlight" }, "forerunner_beam_size": { "string_value": "8" }, "unk_score": { "string_value": "-inf" }, "max_supported_transcripts": { "string_value": "1" }, "chunk_size": { "string_value": "4.8" }, "lexicon_file": { "string_value": "/data/models/conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline/1/lexicon.txt" }, "smearing_mode": { "string_value": "max" }, "log_add": { "string_value": "True" }, "lm_weight": { "string_value": "0.8" }, "blank_token": { "string_value": "#" }, "vocab_file": { "string_value": "/data/models/conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline/1/riva_decoder_vocabulary.txt" }, "ms_per_timestep": { "string_value": "40" }, "use_subword": { "string_value": "True" }, "streaming": { "string_value": "True" }, "beam_size": { "string_value": "32" }, "right_padding_size": { "string_value": "1.6" }, "beam_size_token": { "string_value": "16" }, "sil_token": { "string_value": "▁" }, "num_tokenization": { "string_value": "1" }, "beam_threshold": { "string_value": "20.0" }, "language_model_file": { "string_value": "/data/models/conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline/1/4gram-pruned-0_2_7_9-en-lm-set-2.0.bin" }, "tokenizer_model": { "string_value": "/data/models/conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline/1/8b8f095152034e98b24ab33726708bd0_tokenizer.model" }, "max_execution_batch_size": { "string_value": "1024" }, "forerunner_use_lm": { "string_value": "true" }, "forerunner_beam_size_token": { "string_value": "8" }, "profane_words_file": { "string_value": "/data/models/conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline/1/profane_words_file.txt" 
}, "forerunner_beam_threshold": { "string_value": "10.0" }, "decoder_num_worker_threads": { "string_value": "-1" }, "asr_model_delay": { "string_value": "-1" }, "word_insertion_score": { "string_value": "1.0" }, "unk_token": { "string_value": "" } }, "model_warmup": [], "model_transaction_policy": { "decoupled": false } }
I1213 14:41:19.610473 103 endpointing_library.cc:18] TRITONBACKEND_ModelInitialize: conformer-en-US-asr-offline-endpointing-streaming-offline (version 1)
W1213 14:41:19.611017 140 parameter_parser.cc:144] Parameter 'chunk_size' set but unused.
W1213 14:41:19.611023 140 parameter_parser.cc:144] Parameter 'ms_per_timestep' set but unused.
W1213 14:41:19.611025 140 parameter_parser.cc:144] Parameter 'residue_blanks_at_end' set but unused.
W1213 14:41:19.611027 140 parameter_parser.cc:144] Parameter 'residue_blanks_at_start' set but unused.
W1213 14:41:19.611028 140 parameter_parser.cc:144] Parameter 'start_history' set but unused.
W1213 14:41:19.611030 140 parameter_parser.cc:144] Parameter 'start_th' set but unused.
W1213 14:41:19.611032 140 parameter_parser.cc:144] Parameter 'stop_history' set but unused.
W1213 14:41:19.611033 140 parameter_parser.cc:144] Parameter 'stop_th' set but unused.
W1213 14:41:19.611034 140 parameter_parser.cc:144] Parameter 'streaming' set but unused.
W1213 14:41:19.611037 140 parameter_parser.cc:144] Parameter 'use_subword' set but unused.
W1213 14:41:19.611037 140 parameter_parser.cc:144] Parameter 'vocab_file' set but unused.
I1213 14:41:19.611317 103 backend_model.cc:303] model configuration: { "name": "conformer-en-US-asr-offline-endpointing-streaming-offline", "platform": "", "backend": "riva_asr_endpointing", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 2048, "input": [ { "name": "CLASS_LOGITS", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1, 257 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "SEGMENTS_START_END", "data_type": "TYPE_FP32", "dims": [ -1, 2 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "graph": { "level": 0 }, "priority": "PRIORITY_DEFAULT", "cuda": { "graphs": false, "busy_wait_events": false, "graph_spec": [], "output_copy_stream": true }, "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "sequence_batching": { "max_sequence_idle_microseconds": 60000000, "control_input": [ { "name": "START", "control": [ { "kind": "CONTROL_SEQUENCE_START", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "READY", "control": [ { "kind": "CONTROL_SEQUENCE_READY", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] } ], "state": [] }, "instance_group": [ { "name": "conformer-en-US-asr-offline-endpointing-streaming-offline_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "residue_blanks_at_end": { "string_value": "0" }, "start_th": { "string_value": "0.2" }, "chunk_size": { "string_value": "4.8" }, "endpointing_type": { "string_value": "greedy_ctc" }, "stop_th": { "string_value": "0.98" }, "vocab_file": { "string_value": 
"/data/models/conformer-en-US-asr-offline-endpointing-streaming-offline/1/riva_decoder_vocabulary.txt" }, "start_history": { "string_value": "200" }, "ms_per_timestep": { "string_value": "40" }, "residue_blanks_at_start": { "string_value": "-2" }, "use_subword": { "string_value": "True" }, "streaming": { "string_value": "True" }, "stop_history": { "string_value": "800" } }, "model_warmup": [], "model_transaction_policy": { "decoupled": false } } I1213 14:41:19.611456 103 feature-extractor.cc:400] TRITONBACKEND_ModelInitialize: conformer-en-US-asr-streaming-feature-extractor-streaming (version 1) I1213 14:41:19.622083 103 model_repository_manager.cc:1077] loading: riva-trt-conformer-en-US-asr-streaming-am-streaming:1 I1213 14:41:19.625527 103 backend_model.cc:303] model configuration: { "name": "conformer-en-US-asr-streaming-feature-extractor-streaming", "platform": "", "backend": "riva_asr_features", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 1024, "input": [ { "name": "AUDIO_SIGNAL", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "SAMPLE_RATE", "data_type": "TYPE_UINT32", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "AUDIO_FEATURES", "data_type": "TYPE_FP32", "dims": [ 80, -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "AUDIO_PROCESSED", "data_type": "TYPE_FP32", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "AUDIO_FEATURES_LENGTH", "data_type": "TYPE_INT32", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "graph": { "level": 0 }, "priority": "PRIORITY_DEFAULT", "cuda": { "graphs": false, "busy_wait_events": false, "graph_spec": [], "output_copy_stream": true }, "input_pinned_memory": { "enable": true }, 
"output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "sequence_batching": { "oldest": { "max_candidate_sequences": 1024, "preferred_batch_size": [ 256, 512 ], "max_queue_delay_microseconds": 1000 }, "max_sequence_idle_microseconds": 60000000, "control_input": [ { "name": "START", "control": [ { "kind": "CONTROL_SEQUENCE_START", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "READY", "control": [ { "kind": "CONTROL_SEQUENCE_READY", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "END", "control": [ { "kind": "CONTROL_SEQUENCE_END", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "CORRID", "control": [ { "kind": "CONTROL_SEQUENCE_CORRID", "int32_false_true": [], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_UINT64" } ] } ], "state": [] }, "instance_group": [ { "name": "conformer-en-US-asr-streaming-feature-extractor-streaming_0", "kind": "KIND_GPU", "count": 1, "gpus": [ 0 ], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "sample_rate": { "string_value": "16000" }, "num_features": { "string_value": "80" }, "window_size": { "string_value": "0.025" }, "window_stride": { "string_value": "0.01" }, "streaming": { "string_value": "True" }, "left_padding_size": { "string_value": "1.92" }, "transpose": { "string_value": "False" }, "stddev_floor": { "string_value": "1e-05" }, "right_padding_size": { "string_value": "1.92" }, "gain": { "string_value": "1.0" }, "use_utterance_norm_params": { "string_value": "False" }, "precalc_norm_time_steps": { "string_value": "0" }, "dither": { "string_value": "0.0" }, "apply_normalization": { "string_value": "True" }, 
"precalc_norm_params": { "string_value": "False" }, "norm_per_feature": { "string_value": "True" }, "mean": { "string_value": "-11.4412, -9.9334, -9.1292, -9.0365, -9.2804, -9.5643, -9.7342, -9.6925, -9.6333, -9.2808, -9.1887, -9.1422, -9.1397, -9.2028, -9.2749, -9.4776, -9.9185, -10.1557, -10.3800, -10.5067, -10.3190, -10.4728, -10.5529, -10.6402, -10.6440, -10.5113, -10.7395, -10.7870, -10.6074, -10.5033, -10.8278, -10.6384, -10.8481, -10.6875, -10.5454, -10.4747, -10.5165, -10.4930, -10.3413, -10.3472, -10.3735, -10.6830, -10.8813, -10.6338, -10.3856, -10.7727, -10.8957, -10.8068, -10.7373, -10.6108, -10.3405, -10.2889, -10.3922, -10.4946, -10.3367, -10.4164, -10.9949, -10.7196, -10.3971, -10.1734, -9.9257, -9.6557, -9.1761, -9.6653, -9.7876, -9.7230, -9.7792, -9.7056, -9.2702, -9.4650, -9.2755, -9.1369, -9.1174, -8.9197, -8.5394, -8.2614, -8.1353, -8.1422, -8.3430, -8.6655" }, "stddev": { "string_value": "2.2668, 3.1642, 3.7079, 3.7642, 3.5349, 3.5901, 3.7640, 3.8424, 4.0145, 4.1475, 4.0457, 3.9048, 3.7709, 3.6117, 3.3188, 3.1489, 3.0615, 3.0362, 2.9929, 3.0500, 3.0341, 3.0484, 3.0103, 2.9474, 2.9128, 2.8669, 2.8332, 2.9411, 3.0378, 3.0712, 3.0190, 2.9992, 3.0124, 3.0024, 3.0275, 3.0870, 3.0656, 3.0142, 3.0493, 3.1373, 3.1135, 3.0675, 2.8828, 2.7018, 2.6296, 2.8826, 2.9325, 2.9288, 2.9271, 2.9890, 3.0137, 2.9855, 3.0839, 2.9319, 2.3512, 2.3795, 2.6191, 2.7555, 2.9326, 2.9931, 3.1543, 3.0855, 2.6820, 3.0566, 3.1272, 3.1663, 3.1836, 3.0018, 2.9089, 3.1727, 3.1626, 3.1086, 2.9804, 3.1107, 3.2998, 3.3697, 3.3716, 3.2487, 3.1597, 3.1181" }, "chunk_size": { "string_value": "0.16" }, "max_execution_batch_size": { "string_value": "1024" } }, "model_warmup": [], "model_transaction_policy": { "decoupled": false } } I1213 14:41:19.625697 103 ctc-decoder-library.cc:20] TRITONBACKEND_ModelInitialize: conformer-en-US-asr-streaming-throughput-ctc-decoder-cpu-streaming (version 1) W1213 14:41:19.627099 145 parameter_parser.cc:144] Parameter 'beam_size' set but unused. 
W1213 14:41:19.627111 145 parameter_parser.cc:144] Parameter 'beam_size_token' set but unused. W1213 14:41:19.627115 145 parameter_parser.cc:144] Parameter 'beam_threshold' set but unused. W1213 14:41:19.627116 145 parameter_parser.cc:144] Parameter 'blank_token' set but unused. W1213 14:41:19.627120 145 parameter_parser.cc:144] Parameter 'decoder_num_worker_threads' set but unused. W1213 14:41:19.627125 145 parameter_parser.cc:144] Parameter 'forerunner_beam_size' set but unused. W1213 14:41:19.627127 145 parameter_parser.cc:144] Parameter 'forerunner_beam_size_token' set but unused. W1213 14:41:19.627130 145 parameter_parser.cc:144] Parameter 'forerunner_beam_threshold' set but unused. W1213 14:41:19.627133 145 parameter_parser.cc:144] Parameter 'forerunner_use_lm' set but unused. W1213 14:41:19.627136 145 parameter_parser.cc:144] Parameter 'language_model_file' set but unused. W1213 14:41:19.627139 145 parameter_parser.cc:144] Parameter 'lexicon_file' set but unused. W1213 14:41:19.627143 145 parameter_parser.cc:144] Parameter 'lm_weight' set but unused. W1213 14:41:19.627146 145 parameter_parser.cc:144] Parameter 'log_add' set but unused. W1213 14:41:19.627151 145 parameter_parser.cc:144] Parameter 'max_execution_batch_size' set but unused. W1213 14:41:19.627154 145 parameter_parser.cc:144] Parameter 'max_supported_transcripts' set but unused. W1213 14:41:19.627157 145 parameter_parser.cc:144] Parameter 'num_tokenization' set but unused. W1213 14:41:19.627161 145 parameter_parser.cc:144] Parameter 'profane_words_file' set but unused. W1213 14:41:19.627166 145 parameter_parser.cc:144] Parameter 'set_default_index_to_unk_token' set but unused. W1213 14:41:19.627168 145 parameter_parser.cc:144] Parameter 'sil_token' set but unused. W1213 14:41:19.627171 145 parameter_parser.cc:144] Parameter 'smearing_mode' set but unused. W1213 14:41:19.627177 145 parameter_parser.cc:144] Parameter 'tokenizer_model' set but unused. 
W1213 14:41:19.627179 145 parameter_parser.cc:144] Parameter 'unk_score' set but unused. W1213 14:41:19.627183 145 parameter_parser.cc:144] Parameter 'unk_token' set but unused. W1213 14:41:19.627187 145 parameter_parser.cc:144] Parameter 'vocab_file' set but unused. W1213 14:41:19.627190 145 parameter_parser.cc:144] Parameter 'word_insertion_score' set but unused. I1213 14:41:19.627922 103 backend_model.cc:303] model configuration: { "name": "conformer-en-US-asr-streaming-throughput-ctc-decoder-cpu-streaming", "platform": "", "backend": "riva_asr_decoder", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 1024, "input": [ { "name": "CLASS_LOGITS", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1, 257 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "END_FLAG", "data_type": "TYPE_UINT32", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "SEGMENTS_START_END", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1, 2 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "CUSTOM_CONFIGURATION", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ -1, 2 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "FINAL_TRANSCRIPTS", "data_type": "TYPE_STRING", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "FINAL_TRANSCRIPTS_SCORE", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "FINAL_WORDS_START_END", "data_type": "TYPE_INT32", "dims": [ -1, 2 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PARTIAL_TRANSCRIPTS", "data_type": "TYPE_STRING", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PARTIAL_TRANSCRIPTS_STABILITY", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", 
"is_shape_tensor": false }, { "name": "PARTIAL_WORDS_START_END", "data_type": "TYPE_INT32", "dims": [ -1, 2 ], "label_filename": "", "is_shape_tensor": false }, { "name": "FINAL_WORDS_SCORE", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PARTIAL_WORDS_SCORE", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "graph": { "level": 0 }, "priority": "PRIORITY_DEFAULT", "cuda": { "graphs": false, "busy_wait_events": false, "graph_spec": [], "output_copy_stream": true }, "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "sequence_batching": { "oldest": { "max_candidate_sequences": 1024, "preferred_batch_size": [ 32, 64 ], "max_queue_delay_microseconds": 1000 }, "max_sequence_idle_microseconds": 60000000, "control_input": [ { "name": "START", "control": [ { "kind": "CONTROL_SEQUENCE_START", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "READY", "control": [ { "kind": "CONTROL_SEQUENCE_READY", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "END", "control": [ { "kind": "CONTROL_SEQUENCE_END", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "CORRID", "control": [ { "kind": "CONTROL_SEQUENCE_CORRID", "int32_false_true": [], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_UINT64" } ] } ], "state": [] }, "instance_group": [ { "name": "conformer-en-US-asr-streaming-throughput-ctc-decoder-cpu-streaming_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, 
"metric_tags": {}, "parameters": { "lexicon_file": { "string_value": "/data/models/conformer-en-US-asr-streaming-throughput-ctc-decoder-cpu-streaming/1/lexicon.txt" }, "smearing_mode": { "string_value": "max" }, "log_add": { "string_value": "True" }, "lm_weight": { "string_value": "0.8" }, "blank_token": { "string_value": "#" }, "vocab_file": { "string_value": "/data/models/conformer-en-US-asr-streaming-throughput-ctc-decoder-cpu-streaming/1/riva_decoder_vocabulary.txt" }, "ms_per_timestep": { "string_value": "40" }, "streaming": { "string_value": "True" }, "use_subword": { "string_value": "True" }, "beam_size": { "string_value": "32" }, "right_padding_size": { "string_value": "1.6" }, "beam_size_token": { "string_value": "16" }, "sil_token": { "string_value": "▁" }, "num_tokenization": { "string_value": "1" }, "beam_threshold": { "string_value": "20.0" }, "tokenizer_model": { "string_value": "/data/models/conformer-en-US-asr-streaming-throughput-ctc-decoder-cpu-streaming/1/8b8f095152034e98b24ab33726708bd0_tokenizer.model" }, "language_model_file": { "string_value": "/data/models/conformer-en-US-asr-streaming-throughput-ctc-decoder-cpu-streaming/1/4gram-pruned-0_2_7_9-en-lm-set-2.0.bin" }, "max_execution_batch_size": { "string_value": "1024" }, "forerunner_use_lm": { "string_value": "true" }, "profane_words_file": { "string_value": "/data/models/conformer-en-US-asr-streaming-throughput-ctc-decoder-cpu-streaming/1/profane_words_file.txt" }, "forerunner_beam_size_token": { "string_value": "8" }, "forerunner_beam_threshold": { "string_value": "10.0" }, "decoder_num_worker_threads": { "string_value": "-1" }, "asr_model_delay": { "string_value": "-1" }, "word_insertion_score": { "string_value": "1.0" }, "unk_token": { "string_value": "" }, "left_padding_size": { "string_value": "1.6" }, "set_default_index_to_unk_token": { "string_value": "False" }, "decoder_type": { "string_value": "flashlight" }, "forerunner_beam_size": { "string_value": "8" }, "unk_score": { 
"string_value": "-inf" }, "chunk_size": { "string_value": "0.8" }, "max_supported_transcripts": { "string_value": "1" } }, "model_warmup": [], "model_transaction_policy": { "decoupled": false } } I1213 14:41:19.628005 103 endpointing_library.cc:18] TRITONBACKEND_ModelInitialize: conformer-en-US-asr-streaming-throughput-endpointing-streaming (version 1) W1213 14:41:19.629113 146 parameter_parser.cc:144] Parameter 'chunk_size' set but unused. W1213 14:41:19.629129 146 parameter_parser.cc:144] Parameter 'ms_per_timestep' set but unused. W1213 14:41:19.629135 146 parameter_parser.cc:144] Parameter 'residue_blanks_at_end' set but unused. W1213 14:41:19.629138 146 parameter_parser.cc:144] Parameter 'residue_blanks_at_start' set but unused. W1213 14:41:19.629142 146 parameter_parser.cc:144] Parameter 'start_history' set but unused. W1213 14:41:19.629145 146 parameter_parser.cc:144] Parameter 'start_th' set but unused. W1213 14:41:19.629148 146 parameter_parser.cc:144] Parameter 'stop_history' set but unused. W1213 14:41:19.629153 146 parameter_parser.cc:144] Parameter 'stop_th' set but unused. W1213 14:41:19.629155 146 parameter_parser.cc:144] Parameter 'streaming' set but unused. W1213 14:41:19.629158 146 parameter_parser.cc:144] Parameter 'use_subword' set but unused. W1213 14:41:19.629163 146 parameter_parser.cc:144] Parameter 'vocab_file' set but unused. 
I1213 14:41:19.629555 103 backend_model.cc:303] model configuration: { "name": "conformer-en-US-asr-streaming-throughput-endpointing-streaming", "platform": "", "backend": "riva_asr_endpointing", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 2048, "input": [ { "name": "CLASS_LOGITS", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1, 257 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "SEGMENTS_START_END", "data_type": "TYPE_FP32", "dims": [ -1, 2 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "graph": { "level": 0 }, "priority": "PRIORITY_DEFAULT", "cuda": { "graphs": false, "busy_wait_events": false, "graph_spec": [], "output_copy_stream": true }, "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "sequence_batching": { "max_sequence_idle_microseconds": 60000000, "control_input": [ { "name": "START", "control": [ { "kind": "CONTROL_SEQUENCE_START", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "READY", "control": [ { "kind": "CONTROL_SEQUENCE_READY", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] } ], "state": [] }, "instance_group": [ { "name": "conformer-en-US-asr-streaming-throughput-endpointing-streaming_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "stop_history": { "string_value": "800" }, "residue_blanks_at_end": { "string_value": "0" }, "start_th": { "string_value": "0.2" }, "chunk_size": { "string_value": "0.8" }, "endpointing_type": { "string_value": "greedy_ctc" }, "stop_th": { 
"string_value": "0.98" }, "vocab_file": { "string_value": "/data/models/conformer-en-US-asr-streaming-throughput-endpointing-streaming/1/riva_decoder_vocabulary.txt" }, "start_history": { "string_value": "200" }, "ms_per_timestep": { "string_value": "40" }, "residue_blanks_at_start": { "string_value": "-2" }, "streaming": { "string_value": "True" }, "use_subword": { "string_value": "True" } }, "model_warmup": [], "model_transaction_policy": { "decoupled": false } } I1213 14:41:19.661290 103 pipeline_library.cc:22] TRITONBACKEND_ModelInitialize: riva-punctuation-en-US (version 1) WARNING: Logging before InitGoogleLogging() is written to STDERR W1213 14:41:19.662381 148 parameter_parser.cc:144] Parameter 'attn_mask_tensor_name' set but unused. W1213 14:41:19.662396 148 parameter_parser.cc:144] Parameter 'bos_token' set but unused. W1213 14:41:19.662398 148 parameter_parser.cc:144] Parameter 'capit_logits_tensor_name' set but unused. W1213 14:41:19.662400 148 parameter_parser.cc:144] Parameter 'capitalization_mapping_path' set but unused. W1213 14:41:19.662402 148 parameter_parser.cc:144] Parameter 'delimiter' set but unused. W1213 14:41:19.662405 148 parameter_parser.cc:144] Parameter 'eos_token' set but unused. W1213 14:41:19.662406 148 parameter_parser.cc:144] Parameter 'input_ids_tensor_name' set but unused. W1213 14:41:19.662408 148 parameter_parser.cc:144] Parameter 'language_code' set but unused. W1213 14:41:19.662410 148 parameter_parser.cc:144] Parameter 'model_api' set but unused. W1213 14:41:19.662412 148 parameter_parser.cc:144] Parameter 'model_family' set but unused. W1213 14:41:19.662415 148 parameter_parser.cc:144] Parameter 'pad_chars_with_spaces' set but unused. W1213 14:41:19.662416 148 parameter_parser.cc:144] Parameter 'preserve_accents' set but unused. W1213 14:41:19.662418 148 parameter_parser.cc:144] Parameter 'punct_logits_tensor_name' set but unused. 
W1213 14:41:19.662420 148 parameter_parser.cc:144] Parameter 'punctuation_mapping_path' set but unused. W1213 14:41:19.662421 148 parameter_parser.cc:144] Parameter 'remove_spaces' set but unused. W1213 14:41:19.662423 148 parameter_parser.cc:144] Parameter 'to_lower' set but unused. W1213 14:41:19.662425 148 parameter_parser.cc:144] Parameter 'token_type_tensor_name' set but unused. W1213 14:41:19.662427 148 parameter_parser.cc:144] Parameter 'tokenizer_to_lower' set but unused. W1213 14:41:19.662429 148 parameter_parser.cc:144] Parameter 'unk_token' set but unused. W1213 14:41:19.662431 148 parameter_parser.cc:144] Parameter 'use_int64_nn_inputs' set but unused. W1213 14:41:19.662433 148 parameter_parser.cc:144] Parameter 'vocab' set but unused. W1213 14:41:19.662566 148 parameter_parser.cc:144] Parameter 'model_api' set but unused. W1213 14:41:19.662570 148 parameter_parser.cc:144] Parameter 'model_family' set but unused. I1213 14:41:19.662661 103 backend_model.cc:303] model configuration: { "name": "riva-punctuation-en-US", "platform": "", "backend": "riva_nlp_pipeline", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 8, "input": [ { "name": "PIPELINE_INPUT", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "PIPELINE_OUTPUT", "data_type": "TYPE_STRING", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "priority": "PRIORITY_DEFAULT", "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "instance_group": [ { "name": "riva-punctuation-en-US_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, 
"parameters": { "tokenizer_to_lower": { "string_value": "true" }, "vocab": { "string_value": "/data/models/riva-punctuation-en-US/1/e222f352288a423da453a79b96cc7b75_vocab.txt" }, "capit_logits_tensor_name": { "string_value": "capit_logits" }, "pipeline_type": { "string_value": "punctuation" }, "eos_token": { "string_value": "[SEP]" }, "capitalization_mapping_path": { "string_value": "/data/models/riva-punctuation-en-US/1/fb06800834e74de1bdc32db51da9619c_capit_label_ids.csv" }, "token_type_tensor_name": { "string_value": "token_type_ids" }, "tokenizer": { "string_value": "wordpiece" }, "delimiter": { "string_value": " " }, "pad_chars_with_spaces": { "string_value": "False" }, "remove_spaces": { "string_value": "False" }, "use_int64_nn_inputs": { "string_value": "False" }, "preserve_accents": { "string_value": "false" }, "unk_token": { "string_value": "[UNK]" }, "model_family": { "string_value": "riva" }, "bos_token": { "string_value": "[CLS]" }, "punctuation_mapping_path": { "string_value": "/data/models/riva-punctuation-en-US/1/15eace99434b4c87ba28cbd294b48f43_punct_label_ids.csv" }, "model_api": { "string_value": "/nvidia.riva.nlp.RivaLanguageUnderstanding/PunctuateText" }, "to_lower": { "string_value": "true" }, "load_model": { "string_value": "false" }, "attn_mask_tensor_name": { "string_value": "attention_mask" }, "punct_logits_tensor_name": { "string_value": "punct_logits" }, "language_code": { "string_value": "en-US" }, "model_name": { "string_value": "riva-trt-riva-punctuation-en-US-nn-bert-base-uncased" }, "input_ids_tensor_name": { "string_value": "input_ids" } }, "model_warmup": [] } I1213 14:41:19.662816 103 ctc-decoder-library.cc:23] TRITONBACKEND_ModelInstanceInitialize: conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming_0 (device 0) I1213 14:41:19.722382 103 model_repository_manager.cc:1077] loading: riva-trt-conformer-en-US-asr-streaming-throughput-am-streaming:1 I1213 14:41:19.822670 103 model_repository_manager.cc:1077] loading: 
riva-trt-riva-punctuation-en-US-nn-bert-base-uncased:1 > Riva waiting for Triton server to load all models...retrying in 1 second I1213 14:41:20.788633 138 ctc-decoder.cc:174] Beam Decoder initialized successfully! I1213 14:41:20.788796 103 ctc-decoder-library.cc:23] TRITONBACKEND_ModelInstanceInitialize: conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline_0 (device 0) I1213 14:41:20.789219 103 model_repository_manager.cc:1231] successfully loaded 'conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming' version 1 > Riva waiting for Triton server to load all models...retrying in 1 second I1213 14:41:21.984439 139 ctc-decoder.cc:174] Beam Decoder initialized successfully! I1213 14:41:21.984606 103 endpointing_library.cc:22] TRITONBACKEND_ModelInstanceInitialize: conformer-en-US-asr-offline-endpointing-streaming-offline_0 (device 0) I1213 14:41:21.984815 103 model_repository_manager.cc:1231] successfully loaded 'conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline' version 1 I1213 14:41:22.005238 103 feature-extractor.cc:402] TRITONBACKEND_ModelInstanceInitialize: conformer-en-US-asr-streaming-feature-extractor-streaming_0 (device 0) I1213 14:41:22.005865 103 model_repository_manager.cc:1231] successfully loaded 'conformer-en-US-asr-offline-endpointing-streaming-offline' version 1 I1213 14:41:22.017900 103 ctc-decoder-library.cc:23] TRITONBACKEND_ModelInstanceInitialize: conformer-en-US-asr-streaming-throughput-ctc-decoder-cpu-streaming_0 (device 0) I1213 14:41:22.018174 103 model_repository_manager.cc:1231] successfully loaded 'conformer-en-US-asr-streaming-feature-extractor-streaming' version 1 > Riva waiting for Triton server to load all models...retrying in 1 second I1213 14:41:23.108088 145 ctc-decoder.cc:174] Beam Decoder initialized successfully! 
I1213 14:41:23.108254 103 endpointing_library.cc:22] TRITONBACKEND_ModelInstanceInitialize: conformer-en-US-asr-streaming-throughput-endpointing-streaming_0 (device 0) I1213 14:41:23.108538 103 model_repository_manager.cc:1231] successfully loaded 'conformer-en-US-asr-streaming-throughput-ctc-decoder-cpu-streaming' version 1 I1213 14:41:23.129386 103 feature-extractor.cc:400] TRITONBACKEND_ModelInitialize: conformer-en-US-asr-streaming-throughput-feature-extractor-streaming (version 1) I1213 14:41:23.130044 103 backend_model.cc:303] model configuration: { "name": "conformer-en-US-asr-streaming-throughput-feature-extractor-streaming", "platform": "", "backend": "riva_asr_features", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 1024, "input": [ { "name": "AUDIO_SIGNAL", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "SAMPLE_RATE", "data_type": "TYPE_UINT32", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "AUDIO_FEATURES", "data_type": "TYPE_FP32", "dims": [ 80, -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "AUDIO_PROCESSED", "data_type": "TYPE_FP32", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "AUDIO_FEATURES_LENGTH", "data_type": "TYPE_INT32", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "graph": { "level": 0 }, "priority": "PRIORITY_DEFAULT", "cuda": { "graphs": false, "busy_wait_events": false, "graph_spec": [], "output_copy_stream": true }, "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "sequence_batching": { "oldest": { "max_candidate_sequences": 1024, "preferred_batch_size": [ 256, 512 ], 
"max_queue_delay_microseconds": 1000 }, "max_sequence_idle_microseconds": 60000000, "control_input": [ { "name": "START", "control": [ { "kind": "CONTROL_SEQUENCE_START", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "READY", "control": [ { "kind": "CONTROL_SEQUENCE_READY", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "END", "control": [ { "kind": "CONTROL_SEQUENCE_END", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "CORRID", "control": [ { "kind": "CONTROL_SEQUENCE_CORRID", "int32_false_true": [], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_UINT64" } ] } ], "state": [] }, "instance_group": [ { "name": "conformer-en-US-asr-streaming-throughput-feature-extractor-streaming_0", "kind": "KIND_GPU", "count": 1, "gpus": [ 0 ], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "mean": { "string_value": "-11.4412, -9.9334, -9.1292, -9.0365, -9.2804, -9.5643, -9.7342, -9.6925, -9.6333, -9.2808, -9.1887, -9.1422, -9.1397, -9.2028, -9.2749, -9.4776, -9.9185, -10.1557, -10.3800, -10.5067, -10.3190, -10.4728, -10.5529, -10.6402, -10.6440, -10.5113, -10.7395, -10.7870, -10.6074, -10.5033, -10.8278, -10.6384, -10.8481, -10.6875, -10.5454, -10.4747, -10.5165, -10.4930, -10.3413, -10.3472, -10.3735, -10.6830, -10.8813, -10.6338, -10.3856, -10.7727, -10.8957, -10.8068, -10.7373, -10.6108, -10.3405, -10.2889, -10.3922, -10.4946, -10.3367, -10.4164, -10.9949, -10.7196, -10.3971, -10.1734, -9.9257, -9.6557, -9.1761, -9.6653, -9.7876, -9.7230, -9.7792, -9.7056, -9.2702, -9.4650, -9.2755, -9.1369, -9.1174, -8.9197, -8.5394, -8.2614, -8.1353, -8.1422, -8.3430, -8.6655" }, "stddev": { "string_value": "2.2668, 3.1642, 3.7079, 
3.7642, 3.5349, 3.5901, 3.7640, 3.8424, 4.0145, 4.1475, 4.0457, 3.9048, 3.7709, 3.6117, 3.3188, 3.1489, 3.0615, 3.0362, 2.9929, 3.0500, 3.0341, 3.0484, 3.0103, 2.9474, 2.9128, 2.8669, 2.8332, 2.9411, 3.0378, 3.0712, 3.0190, 2.9992, 3.0124, 3.0024, 3.0275, 3.0870, 3.0656, 3.0142, 3.0493, 3.1373, 3.1135, 3.0675, 2.8828, 2.7018, 2.6296, 2.8826, 2.9325, 2.9288, 2.9271, 2.9890, 3.0137, 2.9855, 3.0839, 2.9319, 2.3512, 2.3795, 2.6191, 2.7555, 2.9326, 2.9931, 3.1543, 3.0855, 2.6820, 3.0566, 3.1272, 3.1663, 3.1836, 3.0018, 2.9089, 3.1727, 3.1626, 3.1086, 2.9804, 3.1107, 3.2998, 3.3697, 3.3716, 3.2487, 3.1597, 3.1181" }, "chunk_size": { "string_value": "0.8" }, "max_execution_batch_size": { "string_value": "1024" }, "sample_rate": { "string_value": "16000" }, "num_features": { "string_value": "80" }, "window_size": { "string_value": "0.025" }, "window_stride": { "string_value": "0.01" }, "streaming": { "string_value": "True" }, "transpose": { "string_value": "False" }, "left_padding_size": { "string_value": "1.6" }, "stddev_floor": { "string_value": "1e-05" }, "right_padding_size": { "string_value": "1.6" }, "gain": { "string_value": "1.0" }, "use_utterance_norm_params": { "string_value": "False" }, "precalc_norm_time_steps": { "string_value": "0" }, "precalc_norm_params": { "string_value": "False" }, "apply_normalization": { "string_value": "True" }, "dither": { "string_value": "0.0" }, "norm_per_feature": { "string_value": "True" } }, "model_warmup": [], "model_transaction_policy": { "decoupled": false } } I1213 14:41:23.130148 103 model_repository_manager.cc:1231] successfully loaded 'conformer-en-US-asr-streaming-throughput-endpointing-streaming' version 1 I1213 14:41:23.133012 103 tensorrt.cc:5294] TRITONBACKEND_Initialize: tensorrt I1213 14:41:23.133205 103 tensorrt.cc:5304] Triton TRITONBACKEND API version: 1.9 I1213 14:41:23.133396 103 tensorrt.cc:5310] 'tensorrt' TRITONBACKEND API version: 1.9 I1213 14:41:23.133656 103 tensorrt.cc:5353] backend configuration: {} 
I1213 14:41:23.133671 103 feature-extractor.cc:402] TRITONBACKEND_ModelInstanceInitialize: conformer-en-US-asr-streaming-throughput-feature-extractor-streaming_0 (device 0)
I1213 14:41:23.136693 103 pipeline_library.cc:25] TRITONBACKEND_ModelInstanceInitialize: riva-punctuation-en-US_0 (device 0)
I1213 14:41:23.137011 103 model_repository_manager.cc:1231] successfully loaded 'conformer-en-US-asr-streaming-throughput-feature-extractor-streaming' version 1
I1213 14:41:23.146284 103 tensorrt.cc:5405] TRITONBACKEND_ModelInitialize: riva-trt-conformer-en-US-asr-offline-am-streaming-offline (version 1)
I1213 14:41:23.146533 103 model_repository_manager.cc:1231] successfully loaded 'riva-punctuation-en-US' version 1
I1213 14:41:23.146940 103 tensorrt.cc:5405] TRITONBACKEND_ModelInitialize: riva-trt-conformer-en-US-asr-streaming-am-streaming (version 1)
I1213 14:41:23.147300 103 tensorrt.cc:5405] TRITONBACKEND_ModelInitialize: riva-trt-riva-punctuation-en-US-nn-bert-base-uncased (version 1)
I1213 14:41:23.147722 103 tensorrt.cc:5405] TRITONBACKEND_ModelInitialize: riva-trt-conformer-en-US-asr-streaming-throughput-am-streaming (version 1)
I1213 14:41:23.148078 103 tensorrt.cc:5454] TRITONBACKEND_ModelInstanceInitialize: riva-trt-conformer-en-US-asr-offline-am-streaming-offline_0 (GPU device 0)
> Riva waiting for Triton server to load all models...retrying in 1 second
I1213 14:41:23.687726 103 logging.cc:49] [MemUsageChange] Init CUDA: CPU +450, GPU +0, now: CPU 3156, GPU 6758 (MiB)
I1213 14:41:24.117139 103 logging.cc:49] Loaded engine size: 367 MiB
> Riva waiting for Triton server to load all models...retrying in 1 second
I1213 14:41:24.252904 103 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 4006, GPU 7054 (MiB)
I1213 14:41:24.384562 103 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +127, GPU +60, now: CPU 4133, GPU 7114 (MiB)
I1213 14:41:24.385761 103 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +281, now: CPU 0, GPU 281 (MiB)
I1213 14:41:24.411641 103 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 3398, GPU 7106 (MiB)
I1213 14:41:24.412102 103 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 3398, GPU 7114 (MiB)
> Riva waiting for Triton server to load all models...retrying in 1 second
I1213 14:41:25.361509 103 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +362, now: CPU 0, GPU 643 (MiB)
E1213 14:41:26.113218 103 logging.cc:43] 1: [convolutionRunner.cpp::executeConv::511] Error Code 1: Cudnn (CUDNN_STATUS_ALLOC_FAILED)
W1213 14:41:26.113738 103 tensorrt.cc:5086] unable to record CUDA graph for riva-trt-conformer-en-US-asr-offline-am-streaming-offline_0
E1213 14:41:26.142181 103 logging.cc:43] 1: [convolutionRunner.cpp::executeConv::511] Error Code 1: Cudnn (CUDNN_STATUS_ALLOC_FAILED)
W1213 14:41:26.142199 103 tensorrt.cc:5086] unable to record CUDA graph for riva-trt-conformer-en-US-asr-offline-am-streaming-offline_0
I1213 14:41:26.142205 103 tensorrt.cc:1411] Created instance riva-trt-conformer-en-US-asr-offline-am-streaming-offline_0 on GPU 0 with stream priority 0 and optimization profile default[0];
I1213 14:41:26.142344 103 tensorrt.cc:5454] TRITONBACKEND_ModelInstanceInitialize: riva-trt-conformer-en-US-asr-streaming-am-streaming_0 (GPU device 0)
I1213 14:41:26.142453 103 model_repository_manager.cc:1231] successfully loaded 'riva-trt-conformer-en-US-asr-offline-am-streaming-offline' version 1
I1213 14:41:26.144573 103 logging.cc:49] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 4295, GPU 7956 (MiB)
> Riva waiting for Triton server to load all models...retrying in 1 second
I1213 14:41:26.719303 103 logging.cc:49] Loaded engine size: 432 MiB
E1213 14:41:26.721904 103 logging.cc:43] 1: [cudaResources.cpp::ScopedCudaStream::37] Error Code 1: Cuda Runtime (out of memory)
E1213 14:41:26.721918 103 logging.cc:43] 4: [runtime.cpp::deserializeCudaEngine::50] Error Code 4: Internal Error (Engine deserialization failed.)
I1213 14:41:26.767387 103 tensorrt.cc:5492] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I1213 14:41:26.767435 103 tensorrt.cc:5431] TRITONBACKEND_ModelFinalize: delete model state
I1213 14:41:26.767558 103 tensorrt.cc:5454] TRITONBACKEND_ModelInstanceInitialize: riva-trt-riva-punctuation-en-US-nn-bert-base-uncased_0 (GPU device 0)
E1213 14:41:26.767637 103 model_repository_manager.cc:1234] failed to load 'riva-trt-conformer-en-US-asr-streaming-am-streaming' version 1: Internal: unable to create TensorRT engine
I1213 14:41:26.770096 103 tensorrt.cc:5492] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I1213 14:41:26.770117 103 tensorrt.cc:5431] TRITONBACKEND_ModelFinalize: delete model state
E1213 14:41:26.770125 103 model_repository_manager.cc:1234] failed to load 'riva-trt-riva-punctuation-en-US-nn-bert-base-uncased' version 1: Internal: unable to create stream: out of memory
I1213 14:41:26.770202 103 tensorrt.cc:5454] TRITONBACKEND_ModelInstanceInitialize: riva-trt-conformer-en-US-asr-streaming-throughput-am-streaming_0 (GPU device 0)
I1213 14:41:26.770734 103 tensorrt.cc:5492] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I1213 14:41:26.770764 103 tensorrt.cc:5431] TRITONBACKEND_ModelFinalize: delete model state
E1213 14:41:26.770780 103 model_repository_manager.cc:1234] failed to load 'riva-trt-conformer-en-US-asr-streaming-throughput-am-streaming' version 1: Internal: unable to create stream: out of memory
E1213 14:41:26.771072 103 model_repository_manager.cc:1420] Invalid argument: ensemble 'conformer-en-US-asr-streaming' depends on 'riva-trt-conformer-en-US-asr-streaming-am-streaming' which has no loaded version
E1213 14:41:26.771077 103 model_repository_manager.cc:1420] Invalid argument: ensemble 'conformer-en-US-asr-streaming-throughput' depends on 'riva-trt-conformer-en-US-asr-streaming-throughput-am-streaming' which has no loaded version
I1213 14:41:26.771756 103 model_repository_manager.cc:1077] loading: conformer-en-US-asr-offline:1
E1213 14:41:26.876345 103 ensemble_scheduler.cc:1306] unable to create stream for conformer-en-US-asr-offline: out of memory
I1213 14:41:26.876645 103 model_repository_manager.cc:1231] successfully loaded 'conformer-en-US-asr-offline' version 1
I1213 14:41:26.877319 103 server.cc:549]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I1213 14:41:26.877612 103 server.cc:576]
+----------------------+-----------------------------------------------------------------------------------+--------+
| Backend              | Path                                                                              | Config |
+----------------------+-----------------------------------------------------------------------------------+--------+
| pytorch              | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so                           | {}     |
| onnxruntime          | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so                   | {}     |
| riva_asr_endpointing | /opt/tritonserver/backends/riva_asr_endpointing/libtriton_riva_asr_endpointing.so | {}     |
| riva_asr_decoder     | /opt/tritonserver/backends/riva_asr_decoder/libtriton_riva_asr_decoder.so         | {}     |
| tensorrt             | /opt/tritonserver/backends/tensorrt/libtriton_tensorrt.so                         | {}     |
| riva_asr_features    | /opt/tritonserver/backends/riva_asr_features/libtriton_riva_asr_features.so       | {}     |
| riva_nlp_pipeline    | /opt/tritonserver/backends/riva_nlp_pipeline/libtriton_riva_nlp_pipeline.so       | {}     |
+----------------------+-----------------------------------------------------------------------------------+--------+

I1213 14:41:26.879091 103 server.cc:619]
+----------------------------------------------------------------------+---------+---------------------------------------------------------------+
| Model                                                                | Version | Status                                                        |
+----------------------------------------------------------------------+---------+---------------------------------------------------------------+
| conformer-en-US-asr-offline                                          | 1       | READY                                                         |
| conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline        | 1       | READY                                                         |
| conformer-en-US-asr-offline-endpointing-streaming-offline            | 1       | READY                                                         |
| conformer-en-US-asr-offline-feature-extractor-streaming-offline      | 1       | READY                                                         |
| conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming              | 1       | READY                                                         |
| conformer-en-US-asr-streaming-endpointing-streaming                  | 1       | READY                                                         |
| conformer-en-US-asr-streaming-feature-extractor-streaming            | 1       | READY                                                         |
| conformer-en-US-asr-streaming-throughput-ctc-decoder-cpu-streaming   | 1       | READY                                                         |
| conformer-en-US-asr-streaming-throughput-endpointing-streaming       | 1       | READY                                                         |
| conformer-en-US-asr-streaming-throughput-feature-extractor-streaming | 1       | READY                                                         |
| riva-punctuation-en-US                                               | 1       | READY                                                         |
| riva-trt-conformer-en-US-asr-offline-am-streaming-offline            | 1       | READY                                                         |
| riva-trt-conformer-en-US-asr-streaming-am-streaming                  | 1       | UNAVAILABLE: Internal: unable to create TensorRT engine       |
| riva-trt-conformer-en-US-asr-streaming-throughput-am-streaming       | 1       | UNAVAILABLE: Internal: unable to create stream: out of memory |
| riva-trt-riva-punctuation-en-US-nn-bert-base-uncased                 | 1       | UNAVAILABLE: Internal: unable to create stream: out of memory |
+----------------------------------------------------------------------+---------+---------------------------------------------------------------+

I1213 14:41:26.942943 103 metrics.cc:650] Collecting metrics for GPU 0: NVIDIA GeForce RTX 3070 Laptop GPU
I1213 14:41:26.943397 103 tritonserver.cc:2123]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                                        |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                                                       |
| server_version                   | 2.21.0                                                                                                                                                                                       |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0]         | /data/models                                                                                                                                                                                 |
| model_control_mode               | MODE_NONE                                                                                                                                                                                    |
| strict_model_config              | 1                                                                                                                                                                                            |
| rate_limit                       | OFF                                                                                                                                                                                          |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                                                                    |
| cuda_memory_pool_byte_size{0}    | 1000000000                                                                                                                                                                                   |
| response_cache_byte_size         | 0                                                                                                                                                                                            |
| min_supported_compute_capability | 6.0                                                                                                                                                                                          |
| strict_readiness                 | 1                                                                                                                                                                                            |
| exit_timeout                     | 30                                                                                                                                                                                           |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I1213 14:41:26.943402 103 server.cc:250] Waiting for in-flight requests to complete.
I1213 14:41:26.943413 103 server.cc:266] Timeout 30: Found 0 model versions that have in-flight inferences
I1213 14:41:26.943419 103 model_repository_manager.cc:1109] unloading: conformer-en-US-asr-offline:1
I1213 14:41:26.943438 103 model_repository_manager.cc:1109] unloading: conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline:1
I1213 14:41:26.943452 103 model_repository_manager.cc:1109] unloading: conformer-en-US-asr-offline-endpointing-streaming-offline:1
I1213 14:41:26.943467 103 model_repository_manager.cc:1109] unloading: conformer-en-US-asr-offline-feature-extractor-streaming-offline:1
I1213 14:41:26.943499 103 model_repository_manager.cc:1109] unloading: riva-punctuation-en-US:1
I1213 14:41:26.943546 103 model_repository_manager.cc:1109] unloading: conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming:1
I1213 14:41:26.943554 103 model_repository_manager.cc:1214] successfully unloaded 'conformer-en-US-asr-offline' version 1
I1213 14:41:26.943568 103 model_repository_manager.cc:1109] unloading: riva-trt-conformer-en-US-asr-offline-am-streaming-offline:1
I1213 14:41:26.943591 103 model_repository_manager.cc:1109] unloading: conformer-en-US-asr-streaming-throughput-feature-extractor-streaming:1
I1213 14:41:26.943601 103 model_repository_manager.cc:1109] unloading: conformer-en-US-asr-streaming-endpointing-streaming:1
I1213 14:41:26.943631 103 model_repository_manager.cc:1109] unloading: conformer-en-US-asr-streaming-feature-extractor-streaming:1
I1213 14:41:26.943852 103 tensorrt.cc:5492] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I1213 14:41:26.943974 103 feature-extractor.cc:404] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I1213 14:41:26.944397 103 endpointing_library.cc:26] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I1213 14:41:26.944588 103 pipeline_library.cc:28] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I1213 14:41:26.944614 103 endpointing_library.cc:26] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I1213 14:41:26.944639 103 ctc-decoder-library.cc:25] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I1213 14:41:26.945040 103 model_repository_manager.cc:1109] unloading: conformer-en-US-asr-streaming-throughput-endpointing-streaming:1
I1213 14:41:26.945055 103 feature-extractor.cc:404] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I1213 14:41:26.945088 103 model_repository_manager.cc:1109] unloading: conformer-en-US-asr-streaming-throughput-ctc-decoder-cpu-streaming:1
I1213 14:41:26.945128 103 server.cc:281] All models are stopped, unloading models
I1213 14:41:26.945143 103 server.cc:288] Timeout 30: Found 11 live models and 0 in-flight non-inference requests
I1213 14:41:26.945145 103 ctc-decoder-library.cc:25] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I1213 14:41:26.945184 103 ctc-decoder-library.cc:25] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I1213 14:41:26.945198 103 feature-extractor.cc:404] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I1213 14:41:26.945276 103 endpointing_library.cc:26] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I1213 14:41:26.946661 103 pipeline_library.cc:24] TRITONBACKEND_ModelFinalize: delete model state
I1213 14:41:26.950283 103 endpointing_library.cc:21] TRITONBACKEND_ModelFinalize: delete model state
I1213 14:41:26.950327 103 model_repository_manager.cc:1214] successfully unloaded 'riva-punctuation-en-US' version 1
I1213 14:41:26.950649 103 model_repository_manager.cc:1214] successfully unloaded 'conformer-en-US-asr-offline-endpointing-streaming-offline' version 1
I1213 14:41:26.950723 103 endpointing_library.cc:21] TRITONBACKEND_ModelFinalize: delete model state
I1213 14:41:26.950970 103 model_repository_manager.cc:1214] successfully unloaded 'conformer-en-US-asr-streaming-endpointing-streaming' version 1
I1213 14:41:26.951930 103 endpointing_library.cc:21] TRITONBACKEND_ModelFinalize: delete model state
I1213 14:41:26.953038 103 model_repository_manager.cc:1214] successfully unloaded 'conformer-en-US-asr-streaming-throughput-endpointing-streaming' version 1
I1213 14:41:26.969080 103 feature-extractor.cc:401] TRITONBACKEND_ModelFinalize: delete model state
I1213 14:41:26.969576 103 model_repository_manager.cc:1214] successfully unloaded 'conformer-en-US-asr-streaming-throughput-feature-extractor-streaming' version 1
I1213 14:41:26.976509 103 feature-extractor.cc:401] TRITONBACKEND_ModelFinalize: delete model state
I1213 14:41:26.985049 103 feature-extractor.cc:401] TRITONBACKEND_ModelFinalize: delete model state
I1213 14:41:26.985930 103 model_repository_manager.cc:1214] successfully unloaded 'conformer-en-US-asr-offline-feature-extractor-streaming-offline' version 1
I1213 14:41:27.001560 103 model_repository_manager.cc:1214] successfully unloaded 'conformer-en-US-asr-streaming-feature-extractor-streaming' version 1
I1213 14:41:27.112648 103 tensorrt.cc:5431] TRITONBACKEND_ModelFinalize: delete model state
I1213 14:41:27.113447 103 model_repository_manager.cc:1214] successfully unloaded 'riva-trt-conformer-en-US-asr-offline-am-streaming-offline' version 1
> Riva waiting for Triton server to load all models...retrying in 1 second
I1213 14:41:27.493645 103 ctc-decoder-library.cc:22] TRITONBACKEND_ModelFinalize: delete model state
I1213 14:41:27.494401 103 model_repository_manager.cc:1214] successfully unloaded 'conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming' version 1
I1213 14:41:27.494917 103 ctc-decoder-library.cc:22] TRITONBACKEND_ModelFinalize: delete model state
I1213 14:41:27.513133 103 ctc-decoder-library.cc:22] TRITONBACKEND_ModelFinalize: delete model state
I1213 14:41:27.521515 103 model_repository_manager.cc:1214] successfully unloaded 'conformer-en-US-asr-offline-ctc-decoder-cpu-streaming-offline' version 1
I1213 14:41:27.540046 103 model_repository_manager.cc:1214] successfully unloaded 'conformer-en-US-asr-streaming-throughput-ctc-decoder-cpu-streaming' version 1
W1213 14:41:27.945286 103 metrics.cc:426] Unable to get power limit for GPU 0. Status:Success, value:0.000000
I1213 14:41:27.945429 103 server.cc:288] Timeout 29: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
> Riva waiting for Triton server to load all models...retrying in 1 second
W1213 14:41:28.945989 103 metrics.cc:426] Unable to get power limit for GPU 0. Status:Success, value:0.000000
> Riva waiting for Triton server to load all models...retrying in 1 second
> Triton server died before reaching ready state. Terminating Riva startup.
Check Triton logs with: docker logs
/opt/riva/bin/start-riva: line 1: kill: (103) - No such process