==========================
=== Riva Speech Skills ===
==========================

NVIDIA Release 21.12 (build 30304767)

Copyright (c) 2018-2021, NVIDIA CORPORATION. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying project or file.

CUDA Capability Major/Minor version number: 7.5
Loading models from s3://tarteel-models/trtis-repo/1.8.0b0/75
Loading model: CnLg-SpeUni256-EATL1300-streaming
> Riva waiting for Triton server to load all models...retrying in 1 second  [repeated 2 times]
I0302 15:25:13.508607 78 metrics.cc:290] Collecting metrics for GPU 0: Tesla T4
I0302 15:25:13.515606 78 shared_library.cc:108] OpenLibraryHandle: /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so
I0302 15:25:13.550529 78 onnxruntime.cc:1970] TRITONBACKEND_Initialize: onnxruntime
I0302 15:25:13.550561 78 onnxruntime.cc:1980] Triton TRITONBACKEND API version: 1.4
I0302 15:25:13.550575 78 onnxruntime.cc:1986] 'onnxruntime' TRITONBACKEND API version: 1.4
I0302 15:25:13.753683 78 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f6914000000' with size 268435456
I0302 15:25:13.755203 78 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0302 15:25:14.002457 78 backend_factory.h:45] Create TritonBackendFactory
I0302 15:25:14.003820 78 plan_backend_factory.cc:49] Create PlanBackendFactory
I0302 15:25:14.003842 78 plan_backend_factory.cc:56] Registering TensorRT Plugins
I0302 15:25:14.003897 78 logging.cc:52] Registered plugin creator - ::GridAnchor_TRT version 1
I0302 15:25:14.003917 78 logging.cc:52] Registered plugin creator - ::GridAnchorRect_TRT version 1
I0302 15:25:14.003947 78 logging.cc:52] Registered plugin creator - ::NMS_TRT version 1
I0302 15:25:14.003964 78 logging.cc:52] Registered plugin creator - ::Reorg_TRT version 1
I0302 15:25:14.003976 78 logging.cc:52] Registered plugin creator - ::Region_TRT version 1
I0302 15:25:14.003992 78 logging.cc:52] Registered plugin creator - ::Clip_TRT version 1
I0302 15:25:14.004007 78 logging.cc:52] Registered plugin creator - ::LReLU_TRT version 1
I0302 15:25:14.004018 78 logging.cc:52] Registered plugin creator - ::PriorBox_TRT version 1
I0302 15:25:14.004036 78 logging.cc:52] Registered plugin creator - ::Normalize_TRT version 1
I0302 15:25:14.004049 78 logging.cc:52] Registered plugin creator - ::ScatterND version 1
I0302 15:25:14.004062 78 logging.cc:52] Registered plugin creator - ::RPROI_TRT version 1
I0302 15:25:14.004076 78 logging.cc:52] Registered plugin creator - ::BatchedNMS_TRT version 1
I0302 15:25:14.004086 78 logging.cc:52] Registered plugin creator - ::BatchedNMSDynamic_TRT version 1
I0302 15:25:14.004094 78 logging.cc:52] Registered plugin creator - ::FlattenConcat_TRT version 1
I0302 15:25:14.004106 78 logging.cc:52] Registered plugin creator - ::CropAndResize version 1
I0302 15:25:14.004117 78 logging.cc:52] Registered plugin creator - ::DetectionLayer_TRT version 1
I0302 15:25:14.004127 78 logging.cc:52] Registered plugin creator - ::EfficientNMS_ONNX_TRT version 1
I0302 15:25:14.004146 78 logging.cc:52] Registered plugin creator - ::EfficientNMS_TRT version 1
I0302 15:25:14.004162 78 logging.cc:52] Registered plugin creator - ::Proposal version 1
I0302 15:25:14.004184 78 logging.cc:52] Registered plugin creator - ::ProposalLayer_TRT version 1
I0302 15:25:14.004199 78 logging.cc:52] Registered plugin creator - ::PyramidROIAlign_TRT version 1
I0302 15:25:14.004212 78 logging.cc:52] Registered plugin creator - ::ResizeNearest_TRT version 1
I0302 15:25:14.004228 78 logging.cc:52] Registered plugin creator - ::Split version 1
I0302 15:25:14.004243 78 logging.cc:52] Registered plugin creator - ::SpecialSlice_TRT version 1
I0302 15:25:14.004261 78 logging.cc:52] Registered plugin creator - ::InstanceNormalization_TRT version 1
I0302 15:25:14.004272 78 ensemble_backend_factory.cc:47] Create EnsembleBackendFactory
> Riva waiting for Triton server to load all models...retrying in 1 second
W0302 15:25:14.336722 78 autofill.cc:243] Proceeding with simple config for now
I0302 15:25:14.336747 78 model_config_utils.cc:637] autofilled config:
name: "CnLg-SpeUni256-EATL1300-streaming"
platform: "ensemble"
max_batch_size: 64
input { name: "AUDIO_SIGNAL" data_type: TYPE_FP32 dims: -1 }
input { name: "SAMPLE_RATE" data_type: TYPE_UINT32 dims: 1 }
input { name: "END_FLAG" data_type: TYPE_UINT32 dims: 1 }
input { name: "CUSTOM_CONFIGURATION" data_type: TYPE_STRING dims: -1 dims: 2 }
output { name: "FINAL_TRANSCRIPTS" data_type: TYPE_STRING dims: -1 }
output { name: "FINAL_TRANSCRIPTS_SCORE" data_type: TYPE_FP32 dims: -1 }
output { name: "FINAL_WORDS_START_END" data_type: TYPE_INT32 dims: -1 dims: 2 }
output { name: "PARTIAL_TRANSCRIPTS" data_type: TYPE_STRING dims: -1 }
output { name: "PARTIAL_TRANSCRIPTS_STABILITY" data_type: TYPE_FP32 dims: -1 }
output { name: "PARTIAL_WORDS_START_END" data_type: TYPE_INT32 dims: -1 dims: 2 }
output { name: "AUDIO_PROCESSED" data_type: TYPE_FP32 dims: 1 }
parameters { key: "chunk_size" value { string_value: "1.2" } }
parameters { key: "compute_timestamps" value { string_value: "True" } }
parameters { key: "decoder_type" value { string_value: "greedy" } }
parameters { key: "language_code" value { string_value: "ar-BH" } }
parameters { key: "lattice_beam" value { string_value: "5" } }
parameters { key: "left_padding_size" value { string_value: "2.4" } }
parameters { key: "max_supported_transcripts" value { string_value: "1" } }
parameters { key: "model_family" value { string_value: "riva" } }
parameters { key: "ms_per_timestep" value { string_value: "80" } }
parameters { key: "offline" value { string_value: "False" } }
parameters { key: "right_padding_size" value { string_value: "2.4" } }
parameters { key: "sample_rate" value { string_value: "16000" } }
parameters { key: "streaming" value { string_value: "True" } }
parameters { key: "type" value { string_value: "online" } }
parameters { key: "vad" value { string_value: "True" } }
ensemble_scheduling {
  step {
    model_name: "CnLg-SpeUni256-EATL1300-streaming-feature-extractor-streaming"
    model_version: 1
    input_map { key: "AUDIO_SIGNAL" value: "AUDIO_SIGNAL" }
    input_map { key: "SAMPLE_RATE" value: "SAMPLE_RATE" }
    output_map { key: "AUDIO_FEATURES" value: "AUDIO_FEATURES" }
    output_map { key: "AUDIO_PROCESSED" value: "AUDIO_PROCESSED" }
  }
  step {
    model_name: "riva-trt-CnLg-SpeUni256-EATL1300-streaming-am-streaming"
    model_version: 1
    input_map { key: "audio_signal" value: "AUDIO_FEATURES" }
    output_map { key: "logprobs" value: "CHARACTER_PROBABILITIES" }
  }
  step {
    model_name: "CnLg-SpeUni256-EATL1300-streaming-voice-activity-detector-ctc-streaming"
    model_version: 1
    input_map { key: "CLASS_LOGITS" value: "CHARACTER_PROBABILITIES" }
    output_map { key: "SEGMENTS_START_END" value: "SEGMENTS_START_END" }
  }
  step {
    model_name: "CnLg-SpeUni256-EATL1300-streaming-ctc-decoder-cpu-streaming"
    model_version: 1
    input_map { key: "CLASS_LOGITS" value: "CHARACTER_PROBABILITIES" }
    input_map { key: "CUSTOM_CONFIGURATION" value: "CUSTOM_CONFIGURATION" }
    input_map { key: "END_FLAG" value: "END_FLAG" }
    input_map { key: "SEGMENTS_START_END" value: "SEGMENTS_START_END" }
    output_map { key: "FINAL_TRANSCRIPTS" value: "FINAL_TRANSCRIPTS" }
    output_map { key: "FINAL_TRANSCRIPTS_SCORE" value: "FINAL_TRANSCRIPTS_SCORE" }
    output_map { key: "FINAL_WORDS_START_END" value: "FINAL_WORDS_START_END" }
    output_map { key: "PARTIAL_TRANSCRIPTS" value: "PARTIAL_TRANSCRIPTS" }
    output_map { key: "PARTIAL_TRANSCRIPTS_STABILITY" value: "PARTIAL_TRANSCRIPTS_STABILITY" }
    output_map { key: "PARTIAL_WORDS_START_END" value: "PARTIAL_WORDS_START_END" }
  }
}
I0302 15:25:14.853070 78 autofill.cc:138] TensorFlow SavedModel autofill: Internal: unable to autofill for 'CnLg-SpeUni256-EATL1300-streaming-ctc-decoder-cpu-streaming', unable to find savedmodel directory named 'model.savedmodel'
I0302 15:25:14.939666 78 autofill.cc:151] TensorFlow GraphDef autofill: Internal: unable to autofill for 'CnLg-SpeUni256-EATL1300-streaming-ctc-decoder-cpu-streaming', unable to find graphdef file named 'model.graphdef'
I0302 15:25:15.057584 78 autofill.cc:164] PyTorch autofill: Internal: unable to autofill for 'CnLg-SpeUni256-EATL1300-streaming-ctc-decoder-cpu-streaming', unable to find PyTorch file named 'model.pt'
> Riva waiting for Triton server to load all models...retrying in 1 second
I0302 15:25:15.281224 78 autofill.cc:196] ONNX autofill: Internal: unable to autofill for 'CnLg-SpeUni256-EATL1300-streaming-ctc-decoder-cpu-streaming', unable to find onnx file or directory named 'model.onnx'
> Riva waiting for Triton server to load all models...retrying in 1 second
I0302 15:25:16.911249 78 logging.cc:49] [MemUsageChange] Init CUDA: CPU +320, GPU +0, now: CPU 340, GPU 314 (MiB)
I0302 15:25:16.914647 78 logging.cc:49] Loaded engine size: 0 MB
I0302 15:25:16.914762 78 logging.cc:49] [MemUsageSnapshot] deserializeCudaEngine begin: CPU 340 MiB, GPU 314 MiB
E0302 15:25:16.934878 78 logging.cc:43] 1: [stdArchiveReader.cpp::StdArchiveReader::29] Error Code 1: Serialization (Serialization assertion magicTagRead == magicTag failed. Magic tag does not match)
E0302 15:25:16.934920 78 logging.cc:43] 4: [runtime.cpp::deserializeCudaEngine::75] Error Code 4: Internal Error (Engine deserialization failed.)
I0302 15:25:16.993806 78 logging.cc:49] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 340, GPU 314 (MiB)
I0302 15:25:16.993842 78 logging.cc:49] Loaded engine size: 0 MB
I0302 15:25:16.993910 78 logging.cc:49] [MemUsageSnapshot] deserializeCudaEngine begin: CPU 340 MiB, GPU 314 MiB
E0302 15:25:16.994315 78 logging.cc:43] 1: [stdArchiveReader.cpp::StdArchiveReader::29] Error Code 1: Serialization (Serialization assertion magicTagRead == magicTag failed. Magic tag does not match)
E0302 15:25:16.994341 78 logging.cc:43] 4: [runtime.cpp::deserializeCudaEngine::75] Error Code 4: Internal Error (Engine deserialization failed.)
I0302 15:25:16.994375 78 autofill.cc:209] TensorRT autofill: Internal: unable to autofill for 'CnLg-SpeUni256-EATL1300-streaming-ctc-decoder-cpu-streaming', unable to find a compatible plan file.
W0302 15:25:16.994388 78 autofill.cc:243] Proceeding with simple config for now
I0302 15:25:16.994396 78 model_config_utils.cc:637] autofilled config:
name: "CnLg-SpeUni256-EATL1300-streaming-ctc-decoder-cpu-streaming"
max_batch_size: 2048
input { name: "CLASS_LOGITS" data_type: TYPE_FP32 dims: -1 dims: 257 }
input { name: "END_FLAG" data_type: TYPE_UINT32 dims: 1 }
input { name: "SEGMENTS_START_END" data_type: TYPE_INT32 dims: -1 dims: 2 }
input { name: "CUSTOM_CONFIGURATION" data_type: TYPE_STRING dims: -1 dims: 2 }
output { name: "FINAL_TRANSCRIPTS" data_type: TYPE_STRING dims: -1 }
output { name: "FINAL_TRANSCRIPTS_SCORE" data_type: TYPE_FP32 dims: -1 }
output { name: "FINAL_WORDS_START_END" data_type: TYPE_INT32 dims: -1 dims: 2 }
output { name: "PARTIAL_TRANSCRIPTS" data_type: TYPE_STRING dims: -1 }
output { name: "PARTIAL_TRANSCRIPTS_STABILITY" data_type: TYPE_FP32 dims: -1 }
output { name: "PARTIAL_WORDS_START_END" data_type: TYPE_INT32 dims: -1 dims: 2 }
instance_group { count: 1 kind: KIND_GPU }
optimization { cuda { output_copy_stream: true } }
sequence_batching {
  max_sequence_idle_microseconds: 60000000
  control_input { name: "START" control { int32_false_true: 0 int32_false_true: 1 } }
  control_input { name: "READY" control { kind: CONTROL_SEQUENCE_READY int32_false_true: 0 int32_false_true: 1 } }
  control_input { name: "END" control { kind: CONTROL_SEQUENCE_END int32_false_true: 0 int32_false_true: 1 } }
  control_input { name: "CORRID" control { kind: CONTROL_SEQUENCE_CORRID data_type: TYPE_UINT64 } }
  oldest { max_candidate_sequences: 2048 preferred_batch_size: 32 preferred_batch_size: 64 max_queue_delay_microseconds: 1000 }
}
parameters { key: "asr_model_delay" value { string_value: "-1" } }
parameters { key: "chunk_size" value { string_value: "1.2" } }
parameters { key: "compute_timestamps" value { string_value: "True" } }
parameters { key: "decoder_num_worker_threads" value { string_value: "-1" } }
parameters { key: "decoder_type" value { string_value: "greedy" } }
parameters { key: "left_padding_size" value { string_value: "2.4" } }
parameters { key: "max_execution_batch_size" value { string_value: "1024" } }
parameters { key: "max_supported_transcripts" value { string_value: "1" } }
parameters { key: "ms_per_timestep" value { string_value: "80" } }
parameters { key: "right_padding_size" value { string_value: "2.4" } }
parameters { key: "streaming" value { string_value: "True" } }
parameters { key: "use_subword" value { string_value: "True" } }
parameters { key: "use_vad" value { string_value: "True" } }
parameters { key: "vocab_file" value { string_value: "/data/models/1.8.0b0/CnLg-SpeUni256-EATL1300-streaming-ctc-decoder-cpu-streaming/1/riva_decoder_vocabulary.txt" } }
backend: "riva_asr_decoder"
model_transaction_policy { }
> Riva waiting for Triton server to load all models...retrying in 1 second
I0302 15:25:17.342891 78 autofill.cc:138] TensorFlow SavedModel autofill: Internal: unable to autofill for 'CnLg-SpeUni256-EATL1300-streaming-feature-extractor-streaming', unable to find savedmodel directory named 'model.savedmodel'
I0302 15:25:17.402536 78 autofill.cc:151] TensorFlow GraphDef autofill: Internal: unable to autofill for 'CnLg-SpeUni256-EATL1300-streaming-feature-extractor-streaming', unable to find graphdef file named 'model.graphdef'
I0302 15:25:17.467620 78 autofill.cc:164] PyTorch autofill: Internal: unable to autofill for 'CnLg-SpeUni256-EATL1300-streaming-feature-extractor-streaming', unable to find PyTorch file named 'model.pt'
I0302 15:25:17.545857 78 autofill.cc:196] ONNX autofill: Internal: unable to autofill for 'CnLg-SpeUni256-EATL1300-streaming-feature-extractor-streaming', unable to find onnx file or directory named 'model.onnx'
I0302 15:25:17.613234 78 autofill.cc:209] TensorRT autofill: Internal: unable to autofill for 'CnLg-SpeUni256-EATL1300-streaming-feature-extractor-streaming', unable to find a compatible plan file.
W0302 15:25:17.613260 78 autofill.cc:243] Proceeding with simple config for now
I0302 15:25:17.613267 78 model_config_utils.cc:637] autofilled config:
name: "CnLg-SpeUni256-EATL1300-streaming-feature-extractor-streaming"
max_batch_size: 2048
input { name: "AUDIO_SIGNAL" data_type: TYPE_FP32 dims: -1 }
input { name: "SAMPLE_RATE" data_type: TYPE_UINT32 dims: 1 }
output { name: "AUDIO_FEATURES" data_type: TYPE_FP32 dims: 80 dims: -1 }
output { name: "AUDIO_PROCESSED" data_type: TYPE_FP32 dims: 1 }
instance_group { count: 1 kind: KIND_GPU }
optimization { cuda { output_copy_stream: true } }
sequence_batching {
  max_sequence_idle_microseconds: 60000000
  control_input { name: "START" control { int32_false_true: 0 int32_false_true: 1 } }
  control_input { name: "READY" control { kind: CONTROL_SEQUENCE_READY int32_false_true: 0 int32_false_true: 1 } }
  control_input { name: "END" control { kind: CONTROL_SEQUENCE_END int32_false_true: 0 int32_false_true: 1 } }
  control_input { name: "CORRID" control { kind: CONTROL_SEQUENCE_CORRID data_type: TYPE_UINT64 } }
  oldest { max_candidate_sequences: 2048 preferred_batch_size: 256 preferred_batch_size: 512 max_queue_delay_microseconds: 1000 }
}
parameters { key: "chunk_size" value { string_value: "1.2" } }
parameters { key: "dither" value { string_value: "1e-05" } }
parameters { key: "gain" value { string_value: "1.0" } }
parameters { key: "left_padding_size" value { string_value: "2.4" } }
parameters { key: "max_execution_batch_size" value { string_value: "1024" } }
parameters { key: "mean" value { string_value: "-11.4412, -9.9334, -9.1292, -9.0365, -9.2804, -9.5643, -9.7342, -9.6925, -9.6333, -9.2808, -9.1887, -9.1422, -9.1397, -9.2028, -9.2749, -9.4776, -9.9185, -10.1557, -10.3800, -10.5067, -10.3190, -10.4728, -10.5529, -10.6402, -10.6440, -10.5113, -10.7395, -10.7870, -10.6074, -10.5033, -10.8278, -10.6384, -10.8481, -10.6875, -10.5454, -10.4747, -10.5165, -10.4930, -10.3413, -10.3472, -10.3735, -10.6830, -10.8813, -10.6338, -10.3856, -10.7727, -10.8957, -10.8068, -10.7373, -10.6108, -10.3405, -10.2889, -10.3922, -10.4946, -10.3367, -10.4164, -10.9949, -10.7196, -10.3971, -10.1734, -9.9257, -9.6557, -9.1761, -9.6653, -9.7876, -9.7230, -9.7792, -9.7056, -9.2702, -9.4650, -9.2755, -9.1369, -9.1174, -8.9197, -8.5394, -8.2614, -8.1353, -8.1422, -8.3430, -8.6655" } }
parameters { key: "norm_per_feature" value { string_value: "True" } }
parameters { key: "num_features" value { string_value: "80" } }
parameters { key: "precalc_norm_params" value { string_value: "False" } }
parameters { key: "precalc_norm_time_steps" value { string_value: "0" } }
parameters { key: "right_padding_size" value { string_value: "2.4" } }
parameters { key: "sample_rate" value { string_value: "16000" } }
parameters { key: "stddev" value { string_value: "2.2668, 3.1642, 3.7079, 3.7642, 3.5349, 3.5901, 3.7640, 3.8424, 4.0145, 4.1475, 4.0457, 3.9048, 3.7709, 3.6117, 3.3188, 3.1489, 3.0615, 3.0362, 2.9929, 3.0500, 3.0341, 3.0484, 3.0103, 2.9474, 2.9128, 2.8669, 2.8332, 2.9411, 3.0378, 3.0712, 3.0190, 2.9992, 3.0124, 3.0024, 3.0275, 3.0870, 3.0656, 3.0142, 3.0493, 3.1373, 3.1135, 3.0675, 2.8828, 2.7018, 2.6296, 2.8826, 2.9325, 2.9288, 2.9271, 2.9890, 3.0137, 2.9855, 3.0839, 2.9319, 2.3512, 2.3795, 2.6191, 2.7555, 2.9326, 2.9931, 3.1543, 3.0855, 2.6820, 3.0566, 3.1272, 3.1663, 3.1836, 3.0018, 2.9089, 3.1727, 3.1626, 3.1086, 2.9804, 3.1107, 3.2998, 3.3697, 3.3716, 3.2487, 3.1597, 3.1181" } }
parameters { key: "stddev_floor" value { string_value: "1e-05" } }
parameters { key: "streaming" value { string_value: "True" } }
parameters { key: "transpose" value { string_value: "False" } }
parameters { key: "use_utterance_norm_params" value { string_value: "False" } }
parameters { key: "window_size" value { string_value: "0.025" } }
parameters { key: "window_stride" value { string_value: "0.01" } }
backend: "riva_asr_features"
model_transaction_policy { }
I0302 15:25:18.239271 78 autofill.cc:138] TensorFlow SavedModel autofill: Internal: unable to autofill for 'CnLg-SpeUni256-EATL1300-streaming-voice-activity-detector-ctc-streaming', unable to find savedmodel directory named 'model.savedmodel'
> Riva waiting for Triton server to load all models...retrying in 1 second
I0302 15:25:18.378833 78 autofill.cc:151] TensorFlow GraphDef autofill: Internal: unable to autofill for 'CnLg-SpeUni256-EATL1300-streaming-voice-activity-detector-ctc-streaming', unable to find graphdef file named 'model.graphdef'
I0302 15:25:18.462903 78 autofill.cc:164] PyTorch autofill: Internal: unable to autofill for 'CnLg-SpeUni256-EATL1300-streaming-voice-activity-detector-ctc-streaming', unable to find PyTorch file named 'model.pt'
I0302 15:25:18.571008 78 autofill.cc:196] ONNX autofill: Internal: unable to autofill for 'CnLg-SpeUni256-EATL1300-streaming-voice-activity-detector-ctc-streaming', unable to find onnx file or directory named 'model.onnx'
I0302 15:25:18.698008 78 logging.cc:49] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 340, GPU 314 (MiB)
I0302 15:25:18.698042 78 logging.cc:49] Loaded engine size: 0 MB
I0302 15:25:18.698109 78 logging.cc:49] [MemUsageSnapshot] deserializeCudaEngine begin: CPU 340 MiB, GPU 314 MiB
E0302 15:25:18.698500 78 logging.cc:43] 1: [stdArchiveReader.cpp::StdArchiveReader::29] Error Code 1: Serialization (Serialization assertion magicTagRead == magicTag failed. Magic tag does not match)
E0302 15:25:18.698528 78 logging.cc:43] 4: [runtime.cpp::deserializeCudaEngine::75] Error Code 4: Internal Error (Engine deserialization failed.)
I0302 15:25:18.698561 78 autofill.cc:209] TensorRT autofill: Internal: unable to autofill for 'CnLg-SpeUni256-EATL1300-streaming-voice-activity-detector-ctc-streaming', unable to find a compatible plan file.
W0302 15:25:18.698575 78 autofill.cc:243] Proceeding with simple config for now
I0302 15:25:18.698589 78 model_config_utils.cc:637] autofilled config:
name: "CnLg-SpeUni256-EATL1300-streaming-voice-activity-detector-ctc-streaming"
max_batch_size: 2048
input { name: "CLASS_LOGITS" data_type: TYPE_FP32 dims: -1 dims: 257 }
output { name: "SEGMENTS_START_END" data_type: TYPE_INT32 dims: -1 dims: 2 }
instance_group { count: 1 kind: KIND_CPU }
optimization { cuda { output_copy_stream: true } }
sequence_batching {
  max_sequence_idle_microseconds: 60000000
  control_input { name: "START" control { int32_false_true: 0 int32_false_true: 1 } }
  control_input { name: "READY" control { kind: CONTROL_SEQUENCE_READY int32_false_true: 0 int32_false_true: 1 } }
}
parameters { key: "chunk_size" value { string_value: "1.2" } }
parameters { key: "ms_per_timestep" value { string_value: "80" } }
parameters { key: "residue_blanks_at_end" value { string_value: "0" } }
parameters { key: "residue_blanks_at_start" value { string_value: "-2" } }
parameters { key: "streaming" value { string_value: "True" } }
parameters { key: "use_subword" value { string_value: "True" } }
parameters { key: "vad_start_history" value { string_value: "300" } }
parameters { key: "vad_start_th" value { string_value: "0.2" } }
parameters { key: "vad_stop_history" value { string_value: "2400" } }
parameters { key: "vad_stop_th" value { string_value: "0.98" } }
parameters { key: "vad_type" value { string_value: "ctc-vad" } }
parameters { key: "vocab_file" value { string_value: "/data/models/1.8.0b0/CnLg-SpeUni256-EATL1300-streaming-voice-activity-detector-ctc-streaming/1/riva_decoder_vocabulary.txt" } }
backend: "riva_asr_vad"
model_transaction_policy { }
> Riva waiting for Triton server to load all models...retrying in 1 second  [repeated 12 times]
I0302 15:25:30.676788 78 logging.cc:49] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 923, GPU 314 (MiB)
I0302 15:25:30.676837 78 logging.cc:49] Loaded engine size: 291 MB
I0302 15:25:30.676907 78 logging.cc:49] [MemUsageSnapshot] deserializeCudaEngine begin: CPU 923 MiB, GPU 314 MiB
> Riva waiting for Triton server to load all models...retrying in 1 second  [repeated 3 times]
I0302 15:25:33.633526 78 logging.cc:52] Using cublasLt as a tactic source
I0302 15:25:33.633630 78 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +491, GPU +212, now: CPU 1422, GPU 814 (MiB)
I0302 15:25:33.633776 78 logging.cc:52] Using cuDNN as a tactic source
> Riva waiting for Triton server to load all models...retrying in 1 second
I0302 15:25:35.417042 78 logging.cc:49] [MemUsageChange] Init cuDNN: CPU +287, GPU +198, now: CPU 1709, GPU 1012 (MiB)
I0302 15:25:35.419297 78 logging.cc:49] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1709, GPU 994 (MiB)
I0302 15:25:35.419344 78 logging.cc:52] Deserialization required 4742097 microseconds.
I0302 15:25:35.419426 78 logging.cc:49] [MemUsageSnapshot] deserializeCudaEngine end: CPU 1709 MiB, GPU 994 MiB
I0302 15:25:35.439235 78 autofill.cc:209] TensorRT autofill: OK:
I0302 15:25:35.439283 78 model_config_utils.cc:637] autofilled config:
name: "riva-trt-CnLg-SpeUni256-EATL1300-streaming-am-streaming"
platform: "tensorrt_plan"
max_batch_size: 64
input { name: "audio_signal" data_type: TYPE_FP32 dims: 80 dims: 601 }
output { name: "logprobs" data_type: TYPE_FP32 dims: 76 dims: 257 }
instance_group { count: 2 kind: KIND_GPU }
default_model_filename: "model.plan"
dynamic_batching { preferred_batch_size: 32 preferred_batch_size: 64 max_queue_delay_microseconds: 1000 preserve_ordering: true }
optimization { cuda { output_copy_stream: true } }
model_transaction_policy { }
I0302 15:25:35.439898 78 model_repository_manager.cc:749] AsyncLoad() 'CnLg-SpeUni256-EATL1300-streaming-voice-activity-detector-ctc-streaming'
> Riva waiting for Triton server to load all models...retrying in 1 second
I0302 15:25:35.592525 78 model_repository_manager.cc:988] TriggerNextAction() 'CnLg-SpeUni256-EATL1300-streaming-voice-activity-detector-ctc-streaming' version 1: 1
I0302 15:25:35.592554 78 model_repository_manager.cc:1026] Load() 'CnLg-SpeUni256-EATL1300-streaming-voice-activity-detector-ctc-streaming' version 1
I0302 15:25:35.592567 78 model_repository_manager.cc:1045] loading: CnLg-SpeUni256-EATL1300-streaming-voice-activity-detector-ctc-streaming:1
I0302 15:25:35.692779 78 model_repository_manager.cc:749] AsyncLoad() 'riva-trt-CnLg-SpeUni256-EATL1300-streaming-am-streaming'
I0302 15:25:35.692780 78 model_repository_manager.cc:1105] CreateInferenceBackend() 'CnLg-SpeUni256-EATL1300-streaming-voice-activity-detector-ctc-streaming' version 1
I0302 15:25:35.736442 78 model_repository_manager.cc:988] TriggerNextAction() 'riva-trt-CnLg-SpeUni256-EATL1300-streaming-am-streaming' version 1: 1
I0302 15:25:35.736470 78 model_repository_manager.cc:1026] Load() 'riva-trt-CnLg-SpeUni256-EATL1300-streaming-am-streaming' version 1
I0302 15:25:35.736482 78 model_repository_manager.cc:1045] loading: riva-trt-CnLg-SpeUni256-EATL1300-streaming-am-streaming:1
I0302 15:25:35.836659 78 model_repository_manager.cc:749] AsyncLoad() 'CnLg-SpeUni256-EATL1300-streaming-feature-extractor-streaming'
I0302 15:25:35.836680 78 model_repository_manager.cc:1105] CreateInferenceBackend() 'riva-trt-CnLg-SpeUni256-EATL1300-streaming-am-streaming' version 1
I0302 15:25:35.879046 78 model_repository_manager.cc:988] TriggerNextAction() 'CnLg-SpeUni256-EATL1300-streaming-feature-extractor-streaming' version 1: 1
I0302 15:25:35.879079 78 model_repository_manager.cc:1026] Load() 'CnLg-SpeUni256-EATL1300-streaming-feature-extractor-streaming' version 1
I0302 15:25:35.879092 78 model_repository_manager.cc:1045] loading: CnLg-SpeUni256-EATL1300-streaming-feature-extractor-streaming:1
I0302 15:25:35.937681 78 shared_library.cc:108] OpenLibraryHandle: /opt/tritonserver/backends/riva_asr_vad/libtriton_riva_asr_vad.so
I0302 15:25:35.945204 78 vad_library.cc:18] TRITONBACKEND_ModelInitialize: CnLg-SpeUni256-EATL1300-streaming-voice-activity-detector-ctc-streaming (version 1)
I0302 15:25:35.948423 78 model_config_utils.cc:1524] ModelConfig 64-bit fields:
I0302 15:25:35.948441 78 model_config_utils.cc:1526] ModelConfig::dynamic_batching::default_queue_policy::default_timeout_microseconds
I0302 15:25:35.948455 78 model_config_utils.cc:1526] ModelConfig::dynamic_batching::max_queue_delay_microseconds
I0302 15:25:35.948463 78 model_config_utils.cc:1526] ModelConfig::dynamic_batching::priority_queue_policy::value::default_timeout_microseconds
I0302 15:25:35.948470 78 model_config_utils.cc:1526] ModelConfig::ensemble_scheduling::step::model_version
I0302 15:25:35.948485 78 model_config_utils.cc:1526] ModelConfig::input::dims
I0302 15:25:35.948504 78 model_config_utils.cc:1526] ModelConfig::input::reshape::shape
I0302 15:25:35.948516 78 model_config_utils.cc:1526] ModelConfig::instance_group::secondary_devices::device_id
I0302 15:25:35.948525 78 model_config_utils.cc:1526] ModelConfig::model_warmup::inputs::value::dims
I0302 15:25:35.948536 78 model_config_utils.cc:1526] ModelConfig::optimization::cuda::graph_spec::graph_lower_bound::input::value::dim
I0302 15:25:35.948550 78 model_config_utils.cc:1526] ModelConfig::optimization::cuda::graph_spec::input::value::dim
I0302 15:25:35.948561 78 model_config_utils.cc:1526] ModelConfig::output::dims
I0302 15:25:35.948570 78 model_config_utils.cc:1526] ModelConfig::output::reshape::shape
I0302 15:25:35.948577 78 model_config_utils.cc:1526] ModelConfig::sequence_batching::direct::max_queue_delay_microseconds
I0302 15:25:35.948583 78 model_config_utils.cc:1526] ModelConfig::sequence_batching::max_sequence_idle_microseconds
I0302 15:25:35.948598 78 model_config_utils.cc:1526] ModelConfig::sequence_batching::oldest::max_queue_delay_microseconds
I0302 15:25:35.948605 78 model_config_utils.cc:1526] ModelConfig::version_policy::specific::versions
W:parameter_parser.cc:118: Parameter max_execution_batch_size could not be set from parameters
W:parameter_parser.cc:119: Default value will be used
W:parameter_parser.cc:118: Parameter max_execution_batch_size could not be set from parameters
W:parameter_parser.cc:119: Default value will be used
terminate called after throwing an instance of 'std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >'
> Riva waiting for Triton server to load all models...retrying in 1 second  [repeated 14 times]
/opt/riva/start-riva.sh: line 4:    78 Aborted                 (core dumped) tritonserver --log-verbose=1 --log-info=true --log-warning=true --log-error=true --strict-model-config=false --model-control-mode=explicit "$LOAD_MODEL_STR" --model-repository "$MODEL_REPOSITORY"
> Triton server died before reaching ready state. Terminating Riva startup.
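As a sanity check on the log above: the fixed engine dimensions that TensorRT autofill reports for the acoustic model (`audio_signal` 80 x 601 in, `logprobs` 76 x 257 out) follow arithmetically from the streaming parameters in the ensemble config (`chunk_size` 1.2 s, `left_padding_size`/`right_padding_size` 2.4 s, `window_stride` 0.01 s, `ms_per_timestep` 80). The sketch below verifies this; the 8x encoder time-subsampling factor is inferred from `ms_per_timestep` divided by the frame stride, not stated anywhere in the log, and 257 is presumably 256 SentencePiece units (per the `SpeUni256` model name) plus a CTC blank.

```python
# Cross-check the TRT acoustic-model shapes against the logged streaming
# parameters. All times in integer milliseconds to avoid float rounding.
chunk_ms = 1200          # chunk_size = 1.2 s
pad_ms = 2400            # left_padding_size = right_padding_size = 2.4 s
stride_ms = 10           # window_stride = 0.01 s per mel feature frame
ms_per_timestep = 80     # ms of audio per encoder output step (from config)

window_ms = chunk_ms + 2 * pad_ms             # 6000 ms of audio per inference
n_frames = window_ms // stride_ms + 1         # 601 feature frames -> input dims 80 x 601
subsampling = ms_per_timestep // stride_ms    # 8x time reduction (assumed, inferred)
n_steps = -(-n_frames // subsampling)         # ceil(601 / 8) = 76 -> output dims 76 x 257

print(n_frames, subsampling, n_steps)  # 601 8 76
```

The shapes are internally consistent, which suggests the startup failure is not a shape mismatch but the two problems the log does show: the "Magic tag does not match" engine-deserialization errors (typically a serialized `model.plan` built for a different TensorRT version or GPU than this T4 container) and the uncaught exception in the decoder backend that aborts Triton.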