Waiting for Jarvis server to load all models...retrying in 10 seconds

Hi all,

I could not get `bash jarvis_start.sh` to run successfully. `bash jarvis_init.sh` completed, and the necessary models were downloaded and extracted. I am using Ubuntu 20.04 with an RTX 3060 GPU and an AMD Ryzen 5800X CPU.

The error I got is:

Starting Jarvis Speech Services. This may take several minutes depending on the number of models deployed.
Waiting for Jarvis server to load all models...retrying in 10 seconds

This is the output of `docker logs jarvis-speech`:

==========================
== Jarvis Speech Skills ==
==========================

NVIDIA Release 20.11 (build 19933361)

Copyright (c) 2018-2020, NVIDIA CORPORATION.  All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION.  All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying
project or file.

NOTE: The SHMEM allocation limit is set to the default of 64MB.  This may be
   insufficient for the inference server.  NVIDIA recommends the use of the following flags:
   nvidia-docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 ...

  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0422 02:00:23.852090 74 metrics.cc:219] Collecting metrics for GPU 0: NVIDIA GeForce RTX 3060
I0422 02:00:23.969278 74 pinned_memory_manager.cc:199] Pinned memory pool is created at '0x7f7cc0000000' with size 268435456
I0422 02:00:23.969574 74 cuda_memory_manager.cc:99] CUDA memory pool is created on device 0 with size 1000000000
I0422 02:00:23.980896 74 model_repository_manager.cc:810] loading: jarvis-trt-jarvis_ner-nn-bert-base-uncased:1
I0422 02:00:23.980974 74 model_repository_manager.cc:810] loading: jarvis-trt-jarvis_intent_weather-nn-bert-base-uncased:1
I0422 02:00:23.981026 74 model_repository_manager.cc:810] loading: jarvis-trt-jarvis_punctuation-nn-bert-base-uncased:1
I0422 02:00:23.981072 74 model_repository_manager.cc:810] loading: jarvis-trt-jarvis_qa-nn-bert-base-uncased:1
I0422 02:00:23.981126 74 model_repository_manager.cc:810] loading: jarvis-trt-jarvis_text_classification_domain-nn-bert-base-uncased:1
I0422 02:00:23.981187 74 model_repository_manager.cc:810] loading: jarvis-trt-jasper:1
I0422 02:00:23.981268 74 model_repository_manager.cc:810] loading: jarvis-trt-tacotron2_encoder:1
I0422 02:00:23.981347 74 model_repository_manager.cc:810] loading: jarvis-trt-waveglow:1
I0422 02:00:23.981495 74 model_repository_manager.cc:810] loading: jarvis_detokenize:1
I0422 02:00:23.981597 74 model_repository_manager.cc:810] loading: jarvis_label_tokens_weather:1
I0422 02:00:23.981720 74 model_repository_manager.cc:810] loading: jarvis_ner_label_tokens:1
I0422 02:00:23.981801 74 custom_backend.cc:198] Creating instance jarvis_detokenize_0_0_cpu on CPU using libtriton_jarvis_nlp_detokenizer.so
I0422 02:00:23.981809 74 model_repository_manager.cc:810] loading: jarvis_punctuation_gen_output:1
I0422 02:00:23.981882 74 model_repository_manager.cc:810] loading: jarvis_punctuation_label_tokens_cap:1
I0422 02:00:23.981962 74 model_repository_manager.cc:810] loading: jarvis_punctuation_label_tokens_punct:1
I0422 02:00:23.982048 74 model_repository_manager.cc:810] loading: jarvis_punctuation_merge_labels:1
I0422 02:00:23.982145 74 model_repository_manager.cc:810] loading: jarvis_qa_postprocessor:1
I0422 02:00:23.982194 74 custom_backend.cc:198] Creating instance jarvis_ner_label_tokens_0_0_cpu on CPU using libtriton_jarvis_nlp_seqlabel.so
I0422 02:00:23.982236 74 model_repository_manager.cc:810] loading: jarvis_qa_preprocessor:1
I0422 02:00:23.982322 74 model_repository_manager.cc:810] loading: jarvis_tokenizer:1
I0422 02:00:23.982414 74 model_repository_manager.cc:810] loading: jasper-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming:1
I0422 02:00:23.982512 74 model_repository_manager.cc:810] loading: jasper-asr-trt-ensemble-vad-streaming-feature-extractor-streaming:1
I0422 02:00:23.982618 74 model_repository_manager.cc:810] loading: jasper-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline:1
I0422 02:00:23.982722 74 model_repository_manager.cc:810] loading: jasper-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline:1
I0422 02:00:23.982818 74 model_repository_manager.cc:810] loading: jasper-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline:1
I0422 02:00:23.982901 74 model_repository_manager.cc:810] loading: jasper-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming:1
I0422 02:00:23.982987 74 model_repository_manager.cc:810] loading: tacotron2_decoder_postnet:1
I0422 02:00:23.982993 74 custom_backend.cc:198] Creating instance jarvis_punctuation_gen_output_0_0_cpu on CPU using libtriton_jarvis_nlp_punctuation.so
I0422 02:00:23.983081 74 model_repository_manager.cc:810] loading: tts_preprocessor:1
I0422 02:00:23.983177 74 model_repository_manager.cc:810] loading: waveglow_denoiser:1
I0422 02:00:23.983547 74 custom_backend.cc:198] Creating instance jarvis_punctuation_label_tokens_cap_0_0_cpu on CPU using libtriton_jarvis_nlp_seqlabel.so
I0422 02:00:23.983716 74 custom_backend.cc:198] Creating instance jarvis_punctuation_label_tokens_punct_0_0_cpu on CPU using libtriton_jarvis_nlp_seqlabel.so
I0422 02:00:23.984161 74 custom_backend.cc:198] Creating instance jarvis_punctuation_merge_labels_0_0_cpu on CPU using libtriton_jarvis_nlp_labels.so
I0422 02:00:23.984475 74 custom_backend.cc:198] Creating instance jarvis_qa_postprocessor_0_0_cpu on CPU using libtriton_jarvis_nlp_qa.so
I0422 02:00:23.984746 74 custom_backend.cc:198] Creating instance jarvis_qa_preprocessor_0_0_cpu on CPU using libtriton_jarvis_nlp_tokenizer.so
I0422 02:00:23.984953 74 custom_backend.cc:198] Creating instance jarvis_tokenizer_0_0_cpu on CPU using libtriton_jarvis_nlp_tokenizer.so
I0422 02:00:23.985166 74 custom_backend.cc:198] Creating instance jasper-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming_0_0_cpu on CPU using libtriton_jarvis_asr_decoder_cpu.so
I0422 02:00:23.985463 74 custom_backend.cc:198] Creating instance jasper-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline_0_0_cpu on CPU using libtriton_jarvis_asr_decoder_cpu.so
I0422 02:00:23.985704 74 custom_backend.cc:198] Creating instance jasper-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline_0_0_cpu on CPU using libtriton_jarvis_asr_vad.so
I0422 02:00:23.985819 74 custom_backend.cc:198] Creating instance jasper-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming_0_0_cpu on CPU using libtriton_jarvis_asr_vad.so
I0422 02:00:23.986227 74 custom_backend.cc:198] Creating instance jarvis_label_tokens_weather_0_0_cpu on CPU using libtriton_jarvis_nlp_seqlabel.so
I0422 02:00:23.986778 74 custom_backend.cc:201] Creating instance jasper-asr-trt-ensemble-vad-streaming-feature-extractor-streaming_0_0_gpu0 on GPU 0 (8.6) using libtriton_jarvis_asr_features.so
I0422 02:00:23.986917 74 custom_backend.cc:201] Creating instance jasper-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline_0_0_gpu0 on GPU 0 (8.6) using libtriton_jarvis_asr_features.so
I0422 02:00:23.987047 74 custom_backend.cc:201] Creating instance tts_preprocessor_0_0_gpu0 on GPU 0 (8.6) using libtriton_jarvis_tts_preprocessor.so
I0422 02:00:23.987074 74 custom_backend.cc:201] Creating instance waveglow_denoiser_0_0_gpu0 on GPU 0 (8.6) using libtriton_jarvis_tts_denoiser.so
I0422 02:00:24.102499 74 model_repository_manager.cc:983] successfully loaded 'tts_preprocessor' version 1
I0422 02:00:24.118252 74 model_repository_manager.cc:983] successfully loaded 'jarvis_punctuation_gen_output' version 1
I0422 02:00:24.118572 74 model_repository_manager.cc:983] successfully loaded 'jarvis_detokenize' version 1
I0422 02:00:24.130729 74 tacotron-decoder-postnet.cc:870] TRITONBACKEND_ModelInitialize: tacotron2_decoder_postnet (version 1)
I0422 02:00:24.130920 74 model_repository_manager.cc:983] successfully loaded 'jarvis_punctuation_merge_labels' version 1
I0422 02:00:24.130924 74 model_repository_manager.cc:983] successfully loaded 'jarvis_punctuation_label_tokens_cap' version 1
I0422 02:00:24.130937 74 model_repository_manager.cc:983] successfully loaded 'jarvis_punctuation_label_tokens_punct' version 1
I0422 02:00:24.131062 74 model_repository_manager.cc:983] successfully loaded 'jarvis_label_tokens_weather' version 1
I0422 02:00:24.131101 74 model_repository_manager.cc:983] successfully loaded 'jarvis_ner_label_tokens' version 1
I0422 02:00:24.131395 74 model_repository_manager.cc:983] successfully loaded 'jarvis_qa_postprocessor' version 1
I0422 02:00:24.152859 74 tacotron-decoder-postnet.cc:764] model configuration:
{
    "name": "tacotron2_decoder_postnet",
    "platform": "",
    "backend": "jarvis_tts_taco_postnet",
    "version_policy": {
        "latest": {
            "num_versions": 1
        }
    },
    "max_batch_size": 8,
    "input": [
        {
            "name": "input_decoder",
            "data_type": "TYPE_FP32",
            "format": "FORMAT_NONE",
            "dims": [
                1,
                400,
                512
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false
        },
        {
            "name": "input_processed_decoder",
            "data_type": "TYPE_FP32",
            "format": "FORMAT_NONE",
            "dims": [
                400,
                128,
                1,
                1
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false
        },
        {
            "name": "input_num_characters",
            "data_type": "TYPE_INT32",
            "format": "FORMAT_NONE",
            "dims": [
                1
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false
        }
    ],
    "output": [
        {
            "name": "spectrogram_chunk",
            "data_type": "TYPE_FP32",
            "dims": [
                1,
                80,
                80
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "z",
            "data_type": "TYPE_FP32",
            "dims": [
                8,
                2656,
                1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "num_valid_samples",
            "data_type": "TYPE_INT32",
            "dims": [
                1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "end_flag",
            "data_type": "TYPE_INT32",
            "dims": [
                1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        }
    ],
    "batch_input": [],
    "batch_output": [],
    "optimization": {
        "priority": "PRIORITY_DEFAULT",
        "input_pinned_memory": {
            "enable": true
        },
        "output_pinned_memory": {
            "enable": true
        }
    },
    "sequence_batching": {
        "oldest": {
            "max_candidate_sequences": 8,
            "preferred_batch_size": [
                8
            ],
            "max_queue_delay_microseconds": 100
        },
        "max_sequence_idle_microseconds": 60000000,
        "control_input": [
            {
                "name": "START",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_START",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "READY",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_READY",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "END",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_END",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "CORRID",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_CORRID",
                        "int32_false_true": [],
                        "fp32_false_true": [],
                        "data_type": "TYPE_UINT64"
                    }
                ]
            }
        ]
    },
    "instance_group": [
        {
            "name": "tacotron2_decoder_postnet_0",
            "kind": "KIND_GPU",
            "count": 1,
            "gpus": [
                0
            ],
            "profile": []
        }
    ],
    "default_model_filename": "",
    "cc_model_filenames": {},
    "metric_tags": {},
    "parameters": {
        "num_samples_per_frame": {
            "string_value": "256"
        },
        "z_dim0": {
            "string_value": "8"
        },
        "z_dim1": {
            "string_value": "2656"
        },
        "tacotron_decoder_engine": {
            "string_value": "/data/models/tacotron2_decoder_postnet/1/model.plan"
        },
        "num_mels": {
            "string_value": "80"
        },
        "encoding_dimension": {
            "string_value": "512"
        },
        "max_execution_batch_size": {
            "string_value": "8"
        },
        "chunk_length": {
            "string_value": "80"
        },
        "max_input_length": {
            "string_value": "400"
        },
        "attention_dimension": {
            "string_value": "128"
        }
    },
    "model_warmup": [],
    "model_transaction_policy": {
        "decoupled": true
    }
}
I0422 02:00:24.153068 74 tacotron-decoder-postnet.cc:927] TRITONBACKEND_ModelInstanceInitialize: tacotron2_decoder_postnet_0 (device 0)
I0422 02:00:24.161061 74 model_repository_manager.cc:983] successfully loaded 'jasper-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming' version 1
I0422 02:00:24.176291 74 model_repository_manager.cc:983] successfully loaded 'jasper-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline' version 1
I0422 02:00:24.192692 74 model_repository_manager.cc:983] successfully loaded 'jarvis_tokenizer' version 1
I0422 02:00:24.209160 74 model_repository_manager.cc:983] successfully loaded 'jarvis_qa_preprocessor' version 1
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0422 02:00:29.718572 74 model_repository_manager.cc:983] successfully loaded 'jasper-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline' version 1
I0422 02:00:29.739433 74 model_repository_manager.cc:983] successfully loaded 'jasper-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming' version 1
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
[E] [TRT] INVALID_CONFIG: The engine plan file is not compatible with this version of TensorRT, expecting library version 7.2.1 got 7.2.2, please rebuild.
[E] [TRT] engine.cpp (1646) - Serialization Error in deserialize: 0 (Core engine deserialization failure)
[E] [TRT] INVALID_STATE: std::exception
[E] [TRT] INVALID_CONFIG: Deserialize the cuda engine failed.
WARNING: Failed to load denoiser: Failed to deserialize engine.
E0422 02:00:35.670627 74 dynamic_batch_scheduler.cc:248] Initialization failed for dynamic-batch scheduler thread 0: initialize error for 'waveglow_denoiser': (23) unable to load denoiser model
E0422 02:00:35.670744 74 sequence_batch_scheduler.cc:1286] failed creating dynamic sequence batcher for OldestFirst 0: Initialization failed for all dynamic-batch scheduler threads
E0422 02:00:35.670995 74 model_repository_manager.cc:986] failed to load 'waveglow_denoiser' version 1: Internal: Initialization failed for all sequence-batch scheduler threads
[03/22/2021-02:00:35] [03/22/2021-02:00:35] [03/22/2021-02:00:35] [03/22/2021-02:00:35]   > Jarvis waiting for Triton server to load all models...retrying in 1 second
/opt/jarvis/bin/start-jarvis: line 4:    74 Segmentation fault      (core dumped) tritonserver --log-verbose=0 --strict-model-config=true $model_repos --cuda-memory-pool-byte-size=0:1000000000
  > Triton server died before reaching ready state. Terminating Jarvis startup.
Check Triton logs with: docker logs 
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
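
The failure in the log above is easy to miss among the info messages. Below is a minimal sketch for filtering it out, assuming the log text is available as a string. It keeps only lines that look like Triton errors (glog-style lines beginning with `E` and a month/day stamp) or TensorRT errors (lines beginning with `[E] [TRT]`). The names `extract_errors`, `GLOG_ERROR`, and `TRT_ERROR` are hypothetical helpers for illustration and are not part of Jarvis or Triton.

```python
import re

# Assumption: glog-style error lines start with "E" + MMDD and a timestamp,
# e.g. "E0422 02:00:35.670627 74 ...", as seen in the log above.
GLOG_ERROR = re.compile(r"^E\d{4} \d{2}:\d{2}:\d{2}\.\d+ ")
# Assumption: TensorRT logger errors are prefixed with "[E] [TRT]".
TRT_ERROR = re.compile(r"^\[E\] \[TRT\] ")


def extract_errors(log_text):
    """Return the lines that look like Triton (glog) or TensorRT errors."""
    errors = []
    for line in log_text.splitlines():
        stripped = line.strip()
        if GLOG_ERROR.match(stripped) or TRT_ERROR.match(stripped):
            errors.append(stripped)
    return errors


# A few lines copied from the log above.
sample = """\
I0422 02:00:29.739433 74 model_repository_manager.cc:983] successfully loaded 'jasper-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming' version 1
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
[E] [TRT] INVALID_CONFIG: Deserialize the cuda engine failed.
E0422 02:00:35.670995 74 model_repository_manager.cc:986] failed to load 'waveglow_denoiser' version 1: Internal: Initialization failed for all sequence-batch scheduler threads
"""

for error_line in extract_errors(sample):
    print(error_line)
```

To use it on a full log, one option is to save the output with `docker logs jarvis-speech > jarvis.log 2>&1` and pass the file contents to `extract_errors`.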

Hi @sonle,

Are you using the default config.sh file, or have you made any changes? If you customized it, could you please share the updated config.sh file?

Thanks

Hi @SunilJB

I am using the default config.sh file; I have not made any changes yet.

Hi @sonle,

Could you please try running jarvis_clean.sh and then jarvis_init.sh? It seems you may have downloaded a new version and are perhaps trying to install it over the top of an old version.

Thanks
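
The version-mismatch suspicion can be checked against the TensorRT error in the first log, which reports two different library versions. Below is a minimal sketch that pulls both version strings out of that message so they can be compared. The helper name `parse_trt_mismatch` and the regular expression are assumptions for illustration, and the code makes no claim about which of the two versions belongs to the container and which to the downloaded model files.

```python
import re

# Assumption: the message keeps the wording seen in the log,
# "expecting library version X got Y".
MISMATCH = re.compile(
    r"expecting library version (\d+(?:\.\d+)*) got (\d+(?:\.\d+)*)"
)


def parse_trt_mismatch(line):
    """Return (expected, got) version strings, or None if the line does not match."""
    match = MISMATCH.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Error line copied from the first log in this thread.
line = (
    "[E] [TRT] INVALID_CONFIG: The engine plan file is not compatible with this "
    "version of TensorRT, expecting library version 7.2.1 got 7.2.2, please rebuild."
)

versions = parse_trt_mismatch(line)
print(versions)  # ('7.2.1', '7.2.2')
if versions is not None and versions[0] != versions[1]:
    print("TensorRT version mismatch between the runtime and the engine plan file")
```

If the two versions differ, the engine plan files and the container image probably come from different releases, which matches the suggestion to clean and re-initialize.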

Hi @SunilJB ,

I ran jarvis_clean.sh and jarvis_init.sh again as you advised, but the problem persisted. To be safe, I also started Triton Inference Server separately, as I thought Jarvis needed that server to be running first. However, that did not help.

This is the output of `docker logs jarvis-speech`:

==========================
== Jarvis Speech Skills ==
==========================

NVIDIA Release 21.03 (build 21236204)

Copyright (c) 2018-2020, NVIDIA CORPORATION.  All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION.  All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying
project or file.

NOTE: The SHMEM allocation limit is set to the default of 64MB.  This may be
   insufficient for the inference server.  NVIDIA recommends the use of the following flags:
   nvidia-docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 ...

  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:03.362996 69 metrics.cc:221] Collecting metrics for GPU 0: NVIDIA GeForce RTX 3060
I0426 03:43:03.372579 69 onnxruntime.cc:1728] TRITONBACKEND_Initialize: onnxruntime
I0426 03:43:03.372705 69 onnxruntime.cc:1738] Triton TRITONBACKEND API version: 1.0
I0426 03:43:03.372710 69 onnxruntime.cc:1744] 'onnxruntime' TRITONBACKEND API version: 1.0
I0426 03:43:03.486850 69 pinned_memory_manager.cc:205] Pinned memory pool is created at '0x7f173c000000' with size 268435456
I0426 03:43:03.489781 69 cuda_memory_manager.cc:103] CUDA memory pool is created on device 0 with size 1000000000
I0426 03:43:03.501892 69 model_repository_manager.cc:787] loading: jarvis-trt-jarvis_ner-nn-bert-base-uncased:1
I0426 03:43:03.602031 69 model_repository_manager.cc:787] loading: jarvis-trt-jarvis_intent_weather-nn-bert-base-uncased:1
I0426 03:43:03.702223 69 model_repository_manager.cc:787] loading: jarvis-trt-jarvis_punctuation-nn-bert-base-uncased:1
I0426 03:43:03.802423 69 model_repository_manager.cc:787] loading: jarvis-trt-jarvis_qa-nn-bert-base-uncased:1
I0426 03:43:03.902638 69 model_repository_manager.cc:787] loading: jarvis-trt-jarvis_text_classification_domain-nn-bert-base-uncased:1
I0426 03:43:04.002959 69 model_repository_manager.cc:787] loading: jarvis-trt-jasper:1
I0426 03:43:04.112427 69 model_repository_manager.cc:787] loading: jarvis-trt-tacotron2_encoder:1
I0426 03:43:04.213004 69 model_repository_manager.cc:787] loading: jarvis-trt-waveglow:1
I0426 03:43:04.313419 69 model_repository_manager.cc:787] loading: jarvis_detokenize:1
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:04.413959 69 model_repository_manager.cc:787] loading: jarvis_label_tokens_weather:1
I0426 03:43:04.414832 69 custom_backend.cc:198] Creating instance jarvis_detokenize_0_0_cpu on CPU using libtriton_jarvis_nlp_detokenizer.so
I0426 03:43:04.421132 69 model_repository_manager.cc:960] successfully loaded 'jarvis_detokenize' version 1
I0426 03:43:04.523099 69 model_repository_manager.cc:787] loading: jarvis_ner_label_tokens:1
I0426 03:43:04.523658 69 custom_backend.cc:198] Creating instance jarvis_label_tokens_weather_0_0_cpu on CPU using libtriton_jarvis_nlp_seqlabel.so
I0426 03:43:04.528921 69 model_repository_manager.cc:960] successfully loaded 'jarvis_label_tokens_weather' version 1
I0426 03:43:04.624038 69 model_repository_manager.cc:787] loading: jarvis_punctuation_gen_output:1
I0426 03:43:04.624663 69 custom_backend.cc:198] Creating instance jarvis_ner_label_tokens_0_0_cpu on CPU using libtriton_jarvis_nlp_seqlabel.so
I0426 03:43:04.626009 69 model_repository_manager.cc:960] successfully loaded 'jarvis_ner_label_tokens' version 1
I0426 03:43:04.724631 69 model_repository_manager.cc:787] loading: jarvis_punctuation_label_tokens_cap:1
I0426 03:43:04.725158 69 custom_backend.cc:198] Creating instance jarvis_punctuation_gen_output_0_0_cpu on CPU using libtriton_jarvis_nlp_punctuation.so
I0426 03:43:04.730624 69 model_repository_manager.cc:960] successfully loaded 'jarvis_punctuation_gen_output' version 1
I0426 03:43:04.825072 69 model_repository_manager.cc:787] loading: jarvis_punctuation_label_tokens_punct:1
I0426 03:43:04.825271 69 custom_backend.cc:198] Creating instance jarvis_punctuation_label_tokens_cap_0_0_cpu on CPU using libtriton_jarvis_nlp_seqlabel.so
I0426 03:43:04.825661 69 model_repository_manager.cc:960] successfully loaded 'jarvis_punctuation_label_tokens_cap' version 1
I0426 03:43:04.925662 69 model_repository_manager.cc:787] loading: jarvis_punctuation_merge_labels:1
I0426 03:43:04.926025 69 custom_backend.cc:198] Creating instance jarvis_punctuation_label_tokens_punct_0_0_cpu on CPU using libtriton_jarvis_nlp_seqlabel.so
I0426 03:43:04.926517 69 model_repository_manager.cc:960] successfully loaded 'jarvis_punctuation_label_tokens_punct' version 1
I0426 03:43:05.037309 69 model_repository_manager.cc:787] loading: jarvis_qa_postprocessor:1
I0426 03:43:05.037469 69 custom_backend.cc:198] Creating instance jarvis_punctuation_merge_labels_0_0_cpu on CPU using libtriton_jarvis_nlp_labels.so
I0426 03:43:05.042246 69 model_repository_manager.cc:960] successfully loaded 'jarvis_punctuation_merge_labels' version 1
I0426 03:43:05.138124 69 model_repository_manager.cc:787] loading: jarvis_qa_preprocessor:1
I0426 03:43:05.138506 69 custom_backend.cc:198] Creating instance jarvis_qa_postprocessor_0_0_cpu on CPU using libtriton_jarvis_nlp_qa.so
I0426 03:43:05.158650 69 model_repository_manager.cc:960] successfully loaded 'jarvis_qa_postprocessor' version 1
I0426 03:43:05.242113 69 model_repository_manager.cc:787] loading: jarvis_tokenizer:1
I0426 03:43:05.242333 69 custom_backend.cc:198] Creating instance jarvis_qa_preprocessor_0_0_cpu on CPU using libtriton_jarvis_nlp_tokenizer.so
I0426 03:43:05.278301 69 model_repository_manager.cc:960] successfully loaded 'jarvis_qa_preprocessor' version 1
I0426 03:43:05.342568 69 model_repository_manager.cc:787] loading: jasper-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming:1
I0426 03:43:05.342891 69 custom_backend.cc:198] Creating instance jarvis_tokenizer_0_0_cpu on CPU using libtriton_jarvis_nlp_tokenizer.so
I0426 03:43:05.357974 69 model_repository_manager.cc:960] successfully loaded 'jarvis_tokenizer' version 1
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:05.442986 69 model_repository_manager.cc:787] loading: jasper-asr-trt-ensemble-vad-streaming-feature-extractor-streaming:1
I0426 03:43:05.443229 69 custom_backend.cc:198] Creating instance jasper-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming_0_0_cpu on CPU using libtriton_jarvis_asr_decoder_cpu.so
I0426 03:43:05.543383 69 model_repository_manager.cc:787] loading: jasper-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline:1
I0426 03:43:05.543721 69 custom_backend.cc:201] Creating instance jasper-asr-trt-ensemble-vad-streaming-feature-extractor-streaming_0_0_gpu0 on GPU 0 (8.6) using libtriton_jarvis_asr_features.so
I0426 03:43:05.643612 69 model_repository_manager.cc:787] loading: jasper-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline:1
I0426 03:43:05.643729 69 custom_backend.cc:198] Creating instance jasper-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline_0_0_cpu on CPU using libtriton_jarvis_asr_decoder_cpu.so
I0426 03:43:05.743804 69 model_repository_manager.cc:787] loading: jasper-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline:1
I0426 03:43:05.744051 69 custom_backend.cc:201] Creating instance jasper-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline_0_0_gpu0 on GPU 0 (8.6) using libtriton_jarvis_asr_features.so
I0426 03:43:05.843966 69 model_repository_manager.cc:787] loading: jasper-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming:1
I0426 03:43:05.844122 69 custom_backend.cc:198] Creating instance jasper-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline_0_0_cpu on CPU using libtriton_jarvis_asr_vad.so
I0426 03:43:05.860973 69 model_repository_manager.cc:960] successfully loaded 'jasper-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline' version 1
I0426 03:43:05.944180 69 model_repository_manager.cc:787] loading: tacotron2_decoder_postnet:1
I0426 03:43:05.944337 69 custom_backend.cc:198] Creating instance jasper-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming_0_0_cpu on CPU using libtriton_jarvis_asr_vad.so
I0426 03:43:05.946374 69 model_repository_manager.cc:960] successfully loaded 'jasper-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming' version 1
I0426 03:43:06.044434 69 model_repository_manager.cc:787] loading: tts_preprocessor:1
I0426 03:43:06.144664 69 model_repository_manager.cc:787] loading: waveglow_denoiser:1
I0426 03:43:06.144897 69 custom_backend.cc:201] Creating instance tts_preprocessor_0_0_gpu0 on GPU 0 (8.6) using libtriton_jarvis_tts_preprocessor.so
I0426 03:43:06.245182 69 custom_backend.cc:201] Creating instance waveglow_denoiser_0_0_gpu0 on GPU 0 (8.6) using libtriton_jarvis_tts_denoiser.so
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:08.939649 69 tacotron-decoder-postnet.cc:873] TRITONBACKEND_ModelInitialize: tacotron2_decoder_postnet (version 1)
I0426 03:43:08.940939 69 tacotron-decoder-postnet.cc:767] model configuration:
{
    "name": "tacotron2_decoder_postnet",
    "platform": "",
    "backend": "jarvis_tts_taco_postnet",
    "version_policy": {
        "latest": {
            "num_versions": 1
        }
    },
    "max_batch_size": 8,
    "input": [
        {
            "name": "input_decoder",
            "data_type": "TYPE_FP32",
            "format": "FORMAT_NONE",
            "dims": [
                1,
                400,
                512
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false
        },
        {
            "name": "input_processed_decoder",
            "data_type": "TYPE_FP32",
            "format": "FORMAT_NONE",
            "dims": [
                400,
                128,
                1,
                1
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false
        },
        {
            "name": "input_num_characters",
            "data_type": "TYPE_INT32",
            "format": "FORMAT_NONE",
            "dims": [
                1
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false
        }
    ],
    "output": [
        {
            "name": "spectrogram_chunk",
            "data_type": "TYPE_FP32",
            "dims": [
                1,
                80,
                80
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "z",
            "data_type": "TYPE_FP32",
            "dims": [
                8,
                2656,
                1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "num_valid_samples",
            "data_type": "TYPE_INT32",
            "dims": [
                1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "end_flag",
            "data_type": "TYPE_INT32",
            "dims": [
                1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        }
    ],
    "batch_input": [],
    "batch_output": [],
    "optimization": {
        "priority": "PRIORITY_DEFAULT",
        "input_pinned_memory": {
            "enable": true
        },
        "output_pinned_memory": {
            "enable": true
        }
    },
    "sequence_batching": {
        "oldest": {
            "max_candidate_sequences": 8,
            "preferred_batch_size": [
                8
            ],
            "max_queue_delay_microseconds": 100
        },
        "max_sequence_idle_microseconds": 60000000,
        "control_input": [
            {
                "name": "START",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_START",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "READY",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_READY",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "END",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_END",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "CORRID",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_CORRID",
                        "int32_false_true": [],
                        "fp32_false_true": [],
                        "data_type": "TYPE_UINT64"
                    }
                ]
            }
        ]
    },
    "instance_group": [
        {
            "name": "tacotron2_decoder_postnet_0",
            "kind": "KIND_GPU",
            "count": 1,
            "gpus": [
                0
            ],
            "profile": []
        }
    ],
    "default_model_filename": "",
    "cc_model_filenames": {},
    "metric_tags": {},
    "parameters": {
        "z_dim0": {
            "string_value": "8"
        },
        "encoding_dimension": {
            "string_value": "512"
        },
        "z_dim1": {
            "string_value": "2656"
        },
        "tacotron_decoder_engine": {
            "string_value": "/data/models/tacotron2_decoder_postnet/1/model.plan"
        },
        "num_mels": {
            "string_value": "80"
        },
        "max_execution_batch_size": {
            "string_value": "8"
        },
        "chunk_length": {
            "string_value": "80"
        },
        "max_input_length": {
            "string_value": "400"
        },
        "attention_dimension": {
            "string_value": "128"
        },
        "num_samples_per_frame": {
            "string_value": "256"
        }
    },
    "model_warmup": [],
    "model_transaction_policy": {
        "decoupled": true
    }
}
I0426 03:43:08.941023 69 tacotron-decoder-postnet.cc:927] TRITONBACKEND_ModelInstanceInitialize: tacotron2_decoder_postnet_0 (device 0)
I0426 03:43:08.943365 69 model_repository_manager.cc:960] successfully loaded 'tts_preprocessor' version 1
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:09.668712 69 model_repository_manager.cc:960] successfully loaded 'jasper-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline' version 1
I0426 03:43:09.684144 69 model_repository_manager.cc:960] successfully loaded 'jasper-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming' version 1
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:15.932501 69 model_repository_manager.cc:960] successfully loaded 'jasper-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline' version 1
I0426 03:43:15.932755 69 model_repository_manager.cc:960] successfully loaded 'jasper-asr-trt-ensemble-vad-streaming-feature-extractor-streaming' version 1
I0426 03:43:16.001809 69 plan_backend.cc:338] Creating instance jarvis-trt-tacotron2_encoder_0_0_gpu0 on GPU 0 (8.6) using model.plan
I0426 03:43:16.033926 69 plan_backend.cc:675] Created instance jarvis-trt-tacotron2_encoder_0_0_gpu0 on GPU 0 with stream priority 0
I0426 03:43:16.034130 69 model_repository_manager.cc:960] successfully loaded 'waveglow_denoiser' version 1
I0426 03:43:16.038848 69 model_repository_manager.cc:960] successfully loaded 'jarvis-trt-tacotron2_encoder' version 1
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:16.998058 69 plan_backend.cc:338] Creating instance jarvis-trt-jarvis_ner-nn-bert-base-uncased_0_0_gpu0 on GPU 0 (8.6) using model.plan
I0426 03:43:17.133542 69 model_repository_manager.cc:960] successfully loaded 'tacotron2_decoder_postnet' version 1
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:17.975898 69 plan_backend.cc:671] Created instance jarvis-trt-jarvis_ner-nn-bert-base-uncased_0_0_gpu0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0426 03:43:17.999371 69 model_repository_manager.cc:960] successfully loaded 'jarvis-trt-jarvis_ner-nn-bert-base-uncased' version 1
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:18.943603 69 plan_backend.cc:338] Creating instance jarvis-trt-jarvis_intent_weather-nn-bert-base-uncased_0_0_gpu0 on GPU 0 (8.6) using model.plan
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:19.926457 69 plan_backend.cc:671] Created instance jarvis-trt-jarvis_intent_weather-nn-bert-base-uncased_0_0_gpu0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0426 03:43:19.949852 69 model_repository_manager.cc:960] successfully loaded 'jarvis-trt-jarvis_intent_weather-nn-bert-base-uncased' version 1
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:20.888272 69 plan_backend.cc:338] Creating instance jarvis-trt-jarvis_punctuation-nn-bert-base-uncased_0_0_gpu0 on GPU 0 (8.6) using model.plan
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:21.860961 69 plan_backend.cc:671] Created instance jarvis-trt-jarvis_punctuation-nn-bert-base-uncased_0_0_gpu0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0426 03:43:21.883874 69 model_repository_manager.cc:960] successfully loaded 'jarvis-trt-jarvis_punctuation-nn-bert-base-uncased' version 1
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:22.828131 69 plan_backend.cc:338] Creating instance jarvis-trt-jarvis_text_classification_domain-nn-bert-base-uncased_0_0_gpu0 on GPU 0 (8.6) using model.plan
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:23.808336 69 plan_backend.cc:671] Created instance jarvis-trt-jarvis_text_classification_domain-nn-bert-base-uncased_0_0_gpu0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0426 03:43:23.830485 69 model_repository_manager.cc:960] successfully loaded 'jarvis-trt-jarvis_text_classification_domain-nn-bert-base-uncased' version 1
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:24.651749 69 plan_backend.cc:338] Creating instance jarvis-trt-waveglow_0_0_gpu0 on GPU 0 (8.6) using model.plan
E0426 03:43:25.471762 69 logging.cc:43] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
E0426 03:43:25.475928 69 logging.cc:43] FAILED_ALLOCATION: std::exception
E0426 03:43:25.519403 69 model_repository_manager.cc:963] failed to load 'jarvis-trt-waveglow' version 1: Internal: unable to create TensorRT context
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:28.316091 69 plan_backend.cc:338] Creating instance jarvis-trt-jasper_0_0_gpu0 on GPU 0 (8.6) using model.plan
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:31.150290 69 plan_backend.cc:671] Created instance jarvis-trt-jasper_0_0_gpu0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0426 03:43:31.174212 69 model_repository_manager.cc:960] successfully loaded 'jarvis-trt-jasper' version 1
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:32.117997 69 plan_backend.cc:338] Creating instance jarvis-trt-jarvis_qa-nn-bert-base-uncased_0_0_gpu0 on GPU 0 (8.6) using model.plan
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:33.090718 69 plan_backend.cc:671] Created instance jarvis-trt-jarvis_qa-nn-bert-base-uncased_0_0_gpu0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0426 03:43:33.113025 69 model_repository_manager.cc:960] successfully loaded 'jarvis-trt-jarvis_qa-nn-bert-base-uncased' version 1
E0426 03:43:33.113853 69 model_repository_manager.cc:1160] Invalid argument: ensemble 'tacotron2_ensemble' depends on 'jarvis-trt-waveglow' which has no loaded version
I0426 03:43:33.113930 69 model_repository_manager.cc:787] loading: jarvis_intent_weather:1
I0426 03:43:33.214049 69 model_repository_manager.cc:787] loading: jarvis_ner:1
I0426 03:43:33.314150 69 model_repository_manager.cc:960] successfully loaded 'jarvis_intent_weather' version 1
I0426 03:43:33.314192 69 model_repository_manager.cc:787] loading: jarvis_punctuation:1
I0426 03:43:33.414283 69 model_repository_manager.cc:960] successfully loaded 'jarvis_ner' version 1
I0426 03:43:33.414308 69 model_repository_manager.cc:787] loading: jarvis_qa:1
I0426 03:43:33.514414 69 model_repository_manager.cc:960] successfully loaded 'jarvis_punctuation' version 1
I0426 03:43:33.514431 69 model_repository_manager.cc:787] loading: jarvis_text_classification_domain:1
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:33.614520 69 model_repository_manager.cc:960] successfully loaded 'jarvis_qa' version 1
I0426 03:43:33.614546 69 model_repository_manager.cc:787] loading: jasper-asr-trt-ensemble-vad-streaming:1
I0426 03:43:33.614579 69 model_repository_manager.cc:960] successfully loaded 'jarvis_text_classification_domain' version 1
I0426 03:43:33.714658 69 model_repository_manager.cc:787] loading: jasper-asr-trt-ensemble-vad-streaming-offline:1
I0426 03:43:33.814743 69 model_repository_manager.cc:960] successfully loaded 'jasper-asr-trt-ensemble-vad-streaming' version 1
I0426 03:43:33.814800 69 model_repository_manager.cc:960] successfully loaded 'jasper-asr-trt-ensemble-vad-streaming-offline' version 1
I0426 03:43:33.814864 69 server.cc:495] 
+-------------------------+-----------------------------------------------------------------------------------------+------+
| Backend                 | Config                                                                                  | Path |
+-------------------------+-----------------------------------------------------------------------------------------+------+
| onnxruntime             | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so                         | {}   |
| jarvis_tts_taco_postnet | /opt/tritonserver/backends/jarvis_tts_taco_postnet/libtriton_jarvis_tts_taco_postnet.so | {}   |
+-------------------------+-----------------------------------------------------------------------------------------+------+

I0426 03:43:33.814998 69 server.cc:538] 
+---------------------------------------------------------------------------------------------+---------+----------------------------------------------------------+
| Model                                                                                       | Version | Status                                                   |
+---------------------------------------------------------------------------------------------+---------+----------------------------------------------------------+
| jarvis-trt-jarvis_intent_weather-nn-bert-base-uncased                                       | 1       | READY                                                    |
| jarvis-trt-jarvis_ner-nn-bert-base-uncased                                                  | 1       | READY                                                    |
| jarvis-trt-jarvis_punctuation-nn-bert-base-uncased                                          | 1       | READY                                                    |
| jarvis-trt-jarvis_qa-nn-bert-base-uncased                                                   | 1       | READY                                                    |
| jarvis-trt-jarvis_text_classification_domain-nn-bert-base-uncased                           | 1       | READY                                                    |
| jarvis-trt-jasper                                                                           | 1       | READY                                                    |
| jarvis-trt-tacotron2_encoder                                                                | 1       | READY                                                    |
| jarvis-trt-waveglow                                                                         | 1       | UNAVAILABLE: Internal: unable to create TensorRT context |
| jarvis_detokenize                                                                           | 1       | READY                                                    |
| jarvis_intent_weather                                                                       | 1       | READY                                                    |
| jarvis_label_tokens_weather                                                                 | 1       | READY                                                    |
| jarvis_ner                                                                                  | 1       | READY                                                    |
| jarvis_ner_label_tokens                                                                     | 1       | READY                                                    |
| jarvis_punctuation                                                                          | 1       | READY                                                    |
| jarvis_punctuation_gen_output                                                               | 1       | READY                                                    |
| jarvis_punctuation_label_tokens_cap                                                         | 1       | READY                                                    |
| jarvis_punctuation_label_tokens_punct                                                       | 1       | READY                                                    |
| jarvis_punctuation_merge_labels                                                             | 1       | READY                                                    |
| jarvis_qa                                                                                   | 1       | READY                                                    |
| jarvis_qa_postprocessor                                                                     | 1       | READY                                                    |
| jarvis_qa_preprocessor                                                                      | 1       | READY                                                    |
| jarvis_text_classification_domain                                                           | 1       | READY                                                    |
| jarvis_tokenizer                                                                            | 1       | READY                                                    |
| jasper-asr-trt-ensemble-vad-streaming                                                       | 1       | READY                                                    |
| jasper-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming                             | 1       | READY                                                    |
| jasper-asr-trt-ensemble-vad-streaming-feature-extractor-streaming                           | 1       | READY                                                    |
| jasper-asr-trt-ensemble-vad-streaming-offline                                               | 1       | READY                                                    |
| jasper-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline             | 1       | READY                                                    |
| jasper-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline           | 1       | READY                                                    |
| jasper-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline | 1       | READY                                                    |
| jasper-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming                 | 1       | READY                                                    |
| tacotron2_decoder_postnet                                                                   | 1       | READY                                                    |
| tacotron2_ensemble                                                                          | -       | Not loaded: No model version was found                   |
| tts_preprocessor                                                                            | 1       | READY                                                    |
| waveglow_denoiser                                                                           | 1       | READY                                                    |
+---------------------------------------------------------------------------------------------+---------+----------------------------------------------------------+

I0426 03:43:33.815070 69 tritonserver.cc:1642] 
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                              |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                             |
| server_version                   | 2.7.0                                                                                                                                              |
| server_extensions                | classification sequence model_repository schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics |
| model_repository_path[0]         | /data/models                                                                                                                                       |
| model_control_mode               | MODE_NONE                                                                                                                                          |
| strict_model_config              | 1                                                                                                                                                  |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                          |
| cuda_memory_pool_byte_size{0}    | 1000000000                                                                                                                                         |
| min_supported_compute_capability | 6.0                                                                                                                                                |
| strict_readiness                 | 1                                                                                                                                                  |
| exit_timeout                     | 30                                                                                                                                                 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------+

I0426 03:43:33.815074 69 server.cc:220] Waiting for in-flight requests to complete.
I0426 03:43:33.815077 69 model_repository_manager.cc:820] unloading: waveglow_denoiser:1
I0426 03:43:33.815095 69 model_repository_manager.cc:820] unloading: tts_preprocessor:1
I0426 03:43:33.815132 69 model_repository_manager.cc:820] unloading: tacotron2_decoder_postnet:1
I0426 03:43:33.815167 69 model_repository_manager.cc:820] unloading: jasper-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline:1
I0426 03:43:33.815234 69 model_repository_manager.cc:820] unloading: jarvis-trt-jarvis_intent_weather-nn-bert-base-uncased:1
I0426 03:43:33.815296 69 model_repository_manager.cc:820] unloading: jarvis-trt-jarvis_qa-nn-bert-base-uncased:1
I0426 03:43:33.815301 69 tacotron-decoder-postnet.cc:1000] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0426 03:43:33.815333 69 model_repository_manager.cc:820] unloading: jarvis-trt-jarvis_ner-nn-bert-base-uncased:1
I0426 03:43:33.815372 69 model_repository_manager.cc:820] unloading: jarvis-trt-jasper:1
I0426 03:43:33.815407 69 model_repository_manager.cc:820] unloading: jarvis-trt-jarvis_punctuation-nn-bert-base-uncased:1
I0426 03:43:33.815440 69 model_repository_manager.cc:820] unloading: jarvis_punctuation_label_tokens_punct:1
I0426 03:43:33.815470 69 model_repository_manager.cc:820] unloading: jarvis_ner_label_tokens:1
I0426 03:43:33.815506 69 model_repository_manager.cc:820] unloading: jarvis_qa:1
I0426 03:43:33.815550 69 model_repository_manager.cc:820] unloading: jarvis_label_tokens_weather:1
I0426 03:43:33.815584 69 model_repository_manager.cc:820] unloading: jasper-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline:1
I0426 03:43:33.815594 69 model_repository_manager.cc:943] successfully unloaded 'jarvis_qa' version 1
I0426 03:43:33.815618 69 model_repository_manager.cc:820] unloading: jarvis-trt-jarvis_text_classification_domain-nn-bert-base-uncased:1
I0426 03:43:33.815709 69 model_repository_manager.cc:943] successfully unloaded 'tts_preprocessor' version 1
I0426 03:43:33.815714 69 model_repository_manager.cc:943] successfully unloaded 'jarvis_punctuation_label_tokens_punct' version 1
I0426 03:43:33.815716 69 model_repository_manager.cc:943] successfully unloaded 'jarvis_ner_label_tokens' version 1
I0426 03:43:33.815723 69 model_repository_manager.cc:820] unloading: jarvis_punctuation:1
I0426 03:43:33.815719 69 model_repository_manager.cc:943] successfully unloaded 'jarvis_label_tokens_weather' version 1
I0426 03:43:33.815769 69 model_repository_manager.cc:820] unloading: jarvis_intent_weather:1
I0426 03:43:33.815782 69 model_repository_manager.cc:820] unloading: jarvis_punctuation_gen_output:1
I0426 03:43:33.815842 69 model_repository_manager.cc:820] unloading: jarvis_punctuation_label_tokens_cap:1
I0426 03:43:33.815888 69 model_repository_manager.cc:943] successfully unloaded 'jarvis_intent_weather' version 1
I0426 03:43:33.815905 69 model_repository_manager.cc:820] unloading: jasper-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline:1
I0426 03:43:33.815928 69 model_repository_manager.cc:820] unloading: jarvis_punctuation_merge_labels:1
I0426 03:43:33.815960 69 model_repository_manager.cc:943] successfully unloaded 'jarvis_punctuation' version 1
I0426 03:43:33.816001 69 model_repository_manager.cc:820] unloading: jarvis_qa_postprocessor:1
I0426 03:43:33.816022 69 model_repository_manager.cc:820] unloading: jarvis_ner:1
I0426 03:43:33.816090 69 model_repository_manager.cc:820] unloading: jarvis_qa_preprocessor:1
I0426 03:43:33.816127 69 model_repository_manager.cc:820] unloading: jasper-asr-trt-ensemble-vad-streaming-feature-extractor-streaming:1
I0426 03:43:33.816144 69 model_repository_manager.cc:943] successfully unloaded 'jarvis_ner' version 1
I0426 03:43:33.816174 69 model_repository_manager.cc:820] unloading: jarvis_detokenize:1
I0426 03:43:33.816196 69 model_repository_manager.cc:820] unloading: jarvis_text_classification_domain:1
I0426 03:43:33.816230 69 model_repository_manager.cc:943] successfully unloaded 'jarvis_punctuation_gen_output' version 1
I0426 03:43:33.816233 69 model_repository_manager.cc:943] successfully unloaded 'jasper-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline' version 1
I0426 03:43:33.816370 69 model_repository_manager.cc:943] successfully unloaded 'jarvis_punctuation_label_tokens_cap' version 1
I0426 03:43:33.816413 69 model_repository_manager.cc:820] unloading: jasper-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming:1
I0426 03:43:33.816436 69 model_repository_manager.cc:820] unloading: jarvis-trt-tacotron2_encoder:1
I0426 03:43:33.816500 69 model_repository_manager.cc:943] successfully unloaded 'jarvis_punctuation_merge_labels' version 1
I0426 03:43:33.816505 69 model_repository_manager.cc:943] successfully unloaded 'jarvis_text_classification_domain' version 1
I0426 03:43:33.816574 69 model_repository_manager.cc:820] unloading: jarvis_tokenizer:1
I0426 03:43:33.816647 69 model_repository_manager.cc:820] unloading: jasper-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming:1
I0426 03:43:33.816781 69 model_repository_manager.cc:943] successfully unloaded 'jarvis_qa_postprocessor' version 1
I0426 03:43:33.816890 69 model_repository_manager.cc:943] successfully unloaded 'jarvis_detokenize' version 1
I0426 03:43:33.816921 69 model_repository_manager.cc:820] unloading: jasper-asr-trt-ensemble-vad-streaming:1
I0426 03:43:33.819063 69 model_repository_manager.cc:820] unloading: jasper-asr-trt-ensemble-vad-streaming-offline:1
I0426 03:43:33.823396 69 model_repository_manager.cc:943] successfully unloaded 'jasper-asr-trt-ensemble-vad-streaming' version 1
I0426 03:43:33.856500 69 server.cc:235] Timeout 30: Found 17 live models and 0 in-flight non-inference requests
I0426 03:43:33.894382 69 model_repository_manager.cc:943] successfully unloaded 'jasper-asr-trt-ensemble-vad-streaming-offline' version 1
I0426 03:43:33.906905 69 model_repository_manager.cc:943] successfully unloaded 'jasper-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming' version 1
I0426 03:43:33.906909 69 model_repository_manager.cc:943] successfully unloaded 'jarvis_tokenizer' version 1
I0426 03:43:33.908663 69 model_repository_manager.cc:943] successfully unloaded 'jarvis_qa_preprocessor' version 1
I0426 03:43:33.908664 69 model_repository_manager.cc:943] successfully unloaded 'jasper-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline' version 1
I0426 03:43:33.908690 69 model_repository_manager.cc:943] successfully unloaded 'jasper-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline' version 1
I0426 03:43:33.920788 69 model_repository_manager.cc:943] successfully unloaded 'jasper-asr-trt-ensemble-vad-streaming-feature-extractor-streaming' version 1
I0426 03:43:33.923958 69 model_repository_manager.cc:943] successfully unloaded 'jasper-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming' version 1
I0426 03:43:33.943302 69 model_repository_manager.cc:943] successfully unloaded 'waveglow_denoiser' version 1
I0426 03:43:33.953706 69 model_repository_manager.cc:943] successfully unloaded 'jarvis-trt-tacotron2_encoder' version 1
I0426 03:43:33.960411 69 model_repository_manager.cc:943] successfully unloaded 'jarvis-trt-jasper' version 1
I0426 03:43:34.007466 69 model_repository_manager.cc:943] successfully unloaded 'jarvis-trt-jarvis_text_classification_domain-nn-bert-base-uncased' version 1
I0426 03:43:34.022288 69 model_repository_manager.cc:943] successfully unloaded 'jarvis-trt-jarvis_intent_weather-nn-bert-base-uncased' version 1
I0426 03:43:34.025184 69 model_repository_manager.cc:943] successfully unloaded 'jarvis-trt-jarvis_punctuation-nn-bert-base-uncased' version 1
I0426 03:43:34.026256 69 model_repository_manager.cc:943] successfully unloaded 'jarvis-trt-jarvis_ner-nn-bert-base-uncased' version 1
I0426 03:43:34.029118 69 model_repository_manager.cc:943] successfully unloaded 'jarvis-trt-jarvis_qa-nn-bert-base-uncased' version 1
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:34.856605 69 server.cc:235] Timeout 29: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:35.856691 69 server.cc:235] Timeout 28: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:36.856775 69 server.cc:235] Timeout 27: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:37.856865 69 server.cc:235] Timeout 26: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:38.856952 69 server.cc:235] Timeout 25: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:39.857030 69 server.cc:235] Timeout 24: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:40.857105 69 server.cc:235] Timeout 23: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:41.857176 69 server.cc:235] Timeout 22: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:42.857245 69 server.cc:235] Timeout 21: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:43.857318 69 server.cc:235] Timeout 20: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:44.857390 69 server.cc:235] Timeout 19: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:45.857459 69 server.cc:235] Timeout 18: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:46.857530 69 server.cc:235] Timeout 17: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:47.857619 69 server.cc:235] Timeout 16: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:48.857713 69 server.cc:235] Timeout 15: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:49.857801 69 server.cc:235] Timeout 14: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:50.857891 69 server.cc:235] Timeout 13: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:51.857969 69 server.cc:235] Timeout 12: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:52.858047 69 server.cc:235] Timeout 11: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:53.858116 69 server.cc:235] Timeout 10: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:54.858204 69 server.cc:235] Timeout 9: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:55.858292 69 server.cc:235] Timeout 8: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:56.858366 69 server.cc:235] Timeout 7: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:57.858434 69 server.cc:235] Timeout 6: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:58.858521 69 server.cc:235] Timeout 5: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:43:59.858614 69 server.cc:235] Timeout 4: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:44:00.858700 69 server.cc:235] Timeout 3: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:44:01.858776 69 server.cc:235] Timeout 2: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:44:02.858852 69 server.cc:235] Timeout 1: Found 1 live models and 0 in-flight non-inference requests
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0426 03:44:03.858946 69 server.cc:235] Timeout 0: Found 1 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Triton server died before reaching ready state. Terminating Jarvis startup.

Could you please try commenting out all the NLP models except one and see if that deploys successfully on your setup?

Thanks

Hi @SunilJB ,

After I commented out all but one of the NLP models in config.sh, as shown below, the Jarvis Speech server started successfully. Two questions. First, which features of the Jarvis Speech server are unavailable while those models are disabled? Second, it looks like I hit a memory limit when all the models were enabled. Is there a way to enable all the NLP models?

models_nlp=(
    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_punctuation:${jarvis_ngc_model_version}"
    # "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_named_entity_recognition:${jarvis_ngc_model_version}"
    # "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_intent_slot:${jarvis_ngc_model_version}"
    # "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_question_answering:${jarvis_ngc_model_version}"
    # "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_text_classification:${jarvis_ngc_model_version}"
)
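
For anyone comparing a failed start with a working one, it can help to list the models that began loading but never reported success. The following is only a minimal sketch: it assumes the Triton log wording quoted in this thread ("loading: NAME:VERSION" and "successfully loaded 'NAME' version N"), and the function name pending_models is hypothetical, not part of Jarvis or Triton.

```python
import re

# Patterns follow the Triton log lines quoted in this thread (an assumption:
# other Triton versions may word these messages differently).
LOADING = re.compile(r"\] loading: (\S+):(\d+)\s*$")
LOADED = re.compile(r"\] successfully loaded '([^']+)' version (\d+)\s*$")


def pending_models(log_text):
    """Return names of models that started loading but never reported success."""
    started, finished = [], set()
    for line in log_text.splitlines():
        m = LOADING.search(line)
        if m:
            key = (m.group(1), m.group(2))
            if key not in started:
                started.append(key)  # keep first-seen order
            continue
        m = LOADED.search(line)
        if m:
            finished.add((m.group(1), m.group(2)))
    return [name for name, version in started if (name, version) not in finished]


# Small sample built from lines in the log below; the last model never finishes.
sample = """\
I0430 04:49:56.500002 70 model_repository_manager.cc:787] loading: jarvis_detokenize:1
I0430 04:49:56.608309 70 model_repository_manager.cc:960] successfully loaded 'jarvis_detokenize' version 1
I0430 04:49:56.099202 70 model_repository_manager.cc:787] loading: jarvis-trt-jasper:1
"""
print(pending_models(sample))  # ['jarvis-trt-jasper']
```

To use it on a real run, save the output of docker logs jarvis-speech to a file and pass the file's contents to pending_models. An empty list means every model that started loading also finished.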

This is the output of docker logs jarvis-speech after that change:

==========================
== Jarvis Speech Skills ==
==========================

NVIDIA Release 21.03 (build 21236204)

Copyright (c) 2018-2020, NVIDIA CORPORATION.  All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION.  All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying
project or file.

NOTE: The SHMEM allocation limit is set to the default of 64MB.  This may be
   insufficient for the inference server.  NVIDIA recommends the use of the following flags:
   nvidia-docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 ...

  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0430 04:49:55.965566 70 metrics.cc:221] Collecting metrics for GPU 0: NVIDIA GeForce RTX 3060
I0430 04:49:55.974911 70 onnxruntime.cc:1728] TRITONBACKEND_Initialize: onnxruntime
I0430 04:49:55.975020 70 onnxruntime.cc:1738] Triton TRITONBACKEND API version: 1.0
I0430 04:49:55.975025 70 onnxruntime.cc:1744] 'onnxruntime' TRITONBACKEND API version: 1.0
I0430 04:49:56.086827 70 pinned_memory_manager.cc:205] Pinned memory pool is created at '0x7f69cc000000' with size 268435456
I0430 04:49:56.089898 70 cuda_memory_manager.cc:103] CUDA memory pool is created on device 0 with size 1000000000
I0430 04:49:56.099202 70 model_repository_manager.cc:787] loading: jarvis-trt-jasper:1
I0430 04:49:56.199332 70 model_repository_manager.cc:787] loading: jarvis-trt-jarvis_punctuation-nn-bert-base-uncased:1
I0430 04:49:56.299440 70 model_repository_manager.cc:787] loading: jarvis-trt-tacotron2_encoder:1
I0430 04:49:56.399644 70 model_repository_manager.cc:787] loading: jarvis-trt-waveglow:1
I0430 04:49:56.500002 70 model_repository_manager.cc:787] loading: jarvis_detokenize:1
I0430 04:49:56.600339 70 model_repository_manager.cc:787] loading: jarvis_punctuation_gen_output:1
I0430 04:49:56.600630 70 custom_backend.cc:198] Creating instance jarvis_detokenize_0_0_cpu on CPU using libtriton_jarvis_nlp_detokenizer.so
I0430 04:49:56.608309 70 model_repository_manager.cc:960] successfully loaded 'jarvis_detokenize' version 1
I0430 04:49:56.700616 70 model_repository_manager.cc:787] loading: jarvis_punctuation_label_tokens_cap:1
I0430 04:49:56.700864 70 custom_backend.cc:198] Creating instance jarvis_punctuation_gen_output_0_0_cpu on CPU using libtriton_jarvis_nlp_punctuation.so
I0430 04:49:56.706384 70 model_repository_manager.cc:960] successfully loaded 'jarvis_punctuation_gen_output' version 1
I0430 04:49:56.800902 70 model_repository_manager.cc:787] loading: jarvis_punctuation_label_tokens_punct:1
I0430 04:49:56.801126 70 custom_backend.cc:198] Creating instance jarvis_punctuation_label_tokens_cap_0_0_cpu on CPU using libtriton_jarvis_nlp_seqlabel.so
I0430 04:49:56.804425 70 model_repository_manager.cc:960] successfully loaded 'jarvis_punctuation_label_tokens_cap' version 1
I0430 04:49:56.901261 70 model_repository_manager.cc:787] loading: jarvis_punctuation_merge_labels:1
I0430 04:49:56.901475 70 custom_backend.cc:198] Creating instance jarvis_punctuation_label_tokens_punct_0_0_cpu on CPU using libtriton_jarvis_nlp_seqlabel.so
I0430 04:49:56.901733 70 model_repository_manager.cc:960] successfully loaded 'jarvis_punctuation_label_tokens_punct' version 1
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0430 04:49:57.002870 70 model_repository_manager.cc:787] loading: jarvis_tokenizer:1
I0430 04:49:57.002990 70 custom_backend.cc:198] Creating instance jarvis_punctuation_merge_labels_0_0_cpu on CPU using libtriton_jarvis_nlp_labels.so
I0430 04:49:57.008503 70 model_repository_manager.cc:960] successfully loaded 'jarvis_punctuation_merge_labels' version 1
I0430 04:49:57.103123 70 model_repository_manager.cc:787] loading: jasper-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming:1
I0430 04:49:57.103322 70 custom_backend.cc:198] Creating instance jarvis_tokenizer_0_0_cpu on CPU using libtriton_jarvis_nlp_tokenizer.so
I0430 04:49:57.137261 70 model_repository_manager.cc:960] successfully loaded 'jarvis_tokenizer' version 1
I0430 04:49:57.203320 70 model_repository_manager.cc:787] loading: jasper-asr-trt-ensemble-vad-streaming-feature-extractor-streaming:1
I0430 04:49:57.203431 70 custom_backend.cc:198] Creating instance jasper-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming_0_0_cpu on CPU using libtriton_jarvis_asr_decoder_cpu.so
I0430 04:49:57.303521 70 model_repository_manager.cc:787] loading: jasper-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline:1
I0430 04:49:57.303746 70 custom_backend.cc:201] Creating instance jasper-asr-trt-ensemble-vad-streaming-feature-extractor-streaming_0_0_gpu0 on GPU 0 (8.6) using libtriton_jarvis_asr_features.so
I0430 04:49:57.403719 70 model_repository_manager.cc:787] loading: jasper-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline:1
I0430 04:49:57.403847 70 custom_backend.cc:198] Creating instance jasper-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline_0_0_cpu on CPU using libtriton_jarvis_asr_decoder_cpu.so
I0430 04:49:57.503913 70 model_repository_manager.cc:787] loading: jasper-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline:1
I0430 04:49:57.504161 70 custom_backend.cc:201] Creating instance jasper-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline_0_0_gpu0 on GPU 0 (8.6) using libtriton_jarvis_asr_features.so
I0430 04:49:57.604121 70 model_repository_manager.cc:787] loading: jasper-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming:1
I0430 04:49:57.604277 70 custom_backend.cc:198] Creating instance jasper-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline_0_0_cpu on CPU using libtriton_jarvis_asr_vad.so
I0430 04:49:57.622895 70 model_repository_manager.cc:960] successfully loaded 'jasper-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline' version 1
I0430 04:49:57.704344 70 model_repository_manager.cc:787] loading: tacotron2_decoder_postnet:1
I0430 04:49:57.704491 70 custom_backend.cc:198] Creating instance jasper-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming_0_0_cpu on CPU using libtriton_jarvis_asr_vad.so
I0430 04:49:57.706543 70 model_repository_manager.cc:960] successfully loaded 'jasper-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming' version 1
I0430 04:49:57.804556 70 model_repository_manager.cc:787] loading: tts_preprocessor:1
I0430 04:49:57.904744 70 model_repository_manager.cc:787] loading: waveglow_denoiser:1
I0430 04:49:57.905001 70 custom_backend.cc:201] Creating instance tts_preprocessor_0_0_gpu0 on GPU 0 (8.6) using libtriton_jarvis_tts_preprocessor.so
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0430 04:49:58.005175 70 custom_backend.cc:201] Creating instance waveglow_denoiser_0_0_gpu0 on GPU 0 (8.6) using libtriton_jarvis_tts_denoiser.so
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0430 04:50:01.164671 70 tacotron-decoder-postnet.cc:873] TRITONBACKEND_ModelInitialize: tacotron2_decoder_postnet (version 1)
I0430 04:50:01.165557 70 model_repository_manager.cc:960] successfully loaded 'tts_preprocessor' version 1
I0430 04:50:01.165948 70 tacotron-decoder-postnet.cc:767] model configuration:
{
    "name": "tacotron2_decoder_postnet",
    "platform": "",
    "backend": "jarvis_tts_taco_postnet",
    "version_policy": {
        "latest": {
            "num_versions": 1
        }
    },
    "max_batch_size": 8,
    "input": [
        {
            "name": "input_decoder",
            "data_type": "TYPE_FP32",
            "format": "FORMAT_NONE",
            "dims": [
                1,
                400,
                512
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false
        },
        {
            "name": "input_processed_decoder",
            "data_type": "TYPE_FP32",
            "format": "FORMAT_NONE",
            "dims": [
                400,
                128,
                1,
                1
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false
        },
        {
            "name": "input_num_characters",
            "data_type": "TYPE_INT32",
            "format": "FORMAT_NONE",
            "dims": [
                1
            ],
            "is_shape_tensor": false,
            "allow_ragged_batch": false
        }
    ],
    "output": [
        {
            "name": "spectrogram_chunk",
            "data_type": "TYPE_FP32",
            "dims": [
                1,
                80,
                80
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "z",
            "data_type": "TYPE_FP32",
            "dims": [
                8,
                2656,
                1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "num_valid_samples",
            "data_type": "TYPE_INT32",
            "dims": [
                1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        },
        {
            "name": "end_flag",
            "data_type": "TYPE_INT32",
            "dims": [
                1
            ],
            "label_filename": "",
            "is_shape_tensor": false
        }
    ],
    "batch_input": [],
    "batch_output": [],
    "optimization": {
        "priority": "PRIORITY_DEFAULT",
        "input_pinned_memory": {
            "enable": true
        },
        "output_pinned_memory": {
            "enable": true
        }
    },
    "sequence_batching": {
        "oldest": {
            "max_candidate_sequences": 8,
            "preferred_batch_size": [
                8
            ],
            "max_queue_delay_microseconds": 100
        },
        "max_sequence_idle_microseconds": 60000000,
        "control_input": [
            {
                "name": "START",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_START",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "READY",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_READY",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "END",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_END",
                        "int32_false_true": [
                            0,
                            1
                        ],
                        "fp32_false_true": [],
                        "data_type": "TYPE_INVALID"
                    }
                ]
            },
            {
                "name": "CORRID",
                "control": [
                    {
                        "kind": "CONTROL_SEQUENCE_CORRID",
                        "int32_false_true": [],
                        "fp32_false_true": [],
                        "data_type": "TYPE_UINT64"
                    }
                ]
            }
        ]
    },
    "instance_group": [
        {
            "name": "tacotron2_decoder_postnet_0",
            "kind": "KIND_GPU",
            "count": 1,
            "gpus": [
                0
            ],
            "profile": []
        }
    ],
    "default_model_filename": "",
    "cc_model_filenames": {},
    "metric_tags": {},
    "parameters": {
        "num_samples_per_frame": {
            "string_value": "256"
        },
        "z_dim0": {
            "string_value": "8"
        },
        "tacotron_decoder_engine": {
            "string_value": "/data/models/tacotron2_decoder_postnet/1/model.plan"
        },
        "num_mels": {
            "string_value": "80"
        },
        "encoding_dimension": {
            "string_value": "512"
        },
        "z_dim1": {
            "string_value": "2656"
        },
        "max_execution_batch_size": {
            "string_value": "8"
        },
        "chunk_length": {
            "string_value": "80"
        },
        "max_input_length": {
            "string_value": "400"
        },
        "attention_dimension": {
            "string_value": "128"
        }
    },
    "model_warmup": [],
    "model_transaction_policy": {
        "decoupled": true
    }
}
I0430 04:50:01.166024 70 tacotron-decoder-postnet.cc:927] TRITONBACKEND_ModelInstanceInitialize: tacotron2_decoder_postnet_0 (device 0)
I0430 04:50:01.315696 70 model_repository_manager.cc:960] successfully loaded 'jasper-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming' version 1
I0430 04:50:01.473259 70 model_repository_manager.cc:960] successfully loaded 'jasper-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline' version 1
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0430 04:50:08.100838 70 model_repository_manager.cc:960] successfully loaded 'jasper-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline' version 1
I0430 04:50:08.100838 70 model_repository_manager.cc:960] successfully loaded 'jasper-asr-trt-ensemble-vad-streaming-feature-extractor-streaming' version 1
I0430 04:50:08.172235 70 plan_backend.cc:338] Creating instance jarvis-trt-tacotron2_encoder_0_0_gpu0 on GPU 0 (8.6) using model.plan
I0430 04:50:08.205693 70 plan_backend.cc:675] Created instance jarvis-trt-tacotron2_encoder_0_0_gpu0 on GPU 0 with stream priority 0
I0430 04:50:08.210863 70 model_repository_manager.cc:960] successfully loaded 'jarvis-trt-tacotron2_encoder' version 1
I0430 04:50:08.210916 70 model_repository_manager.cc:960] successfully loaded 'waveglow_denoiser' version 1
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0430 04:50:09.286023 70 model_repository_manager.cc:960] successfully loaded 'tacotron2_decoder_postnet' version 1
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0430 04:50:11.032784 70 plan_backend.cc:338] Creating instance jarvis-trt-jasper_0_0_gpu0 on GPU 0 (8.6) using model.plan
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0430 04:50:13.855362 70 plan_backend.cc:671] Created instance jarvis-trt-jasper_0_0_gpu0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0430 04:50:13.877920 70 model_repository_manager.cc:960] successfully loaded 'jarvis-trt-jasper' version 1
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0430 04:50:14.816483 70 plan_backend.cc:338] Creating instance jarvis-trt-jarvis_punctuation-nn-bert-base-uncased_0_0_gpu0 on GPU 0 (8.6) using model.plan
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0430 04:50:15.788078 70 plan_backend.cc:671] Created instance jarvis-trt-jarvis_punctuation-nn-bert-base-uncased_0_0_gpu0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0430 04:50:15.811425 70 model_repository_manager.cc:960] successfully loaded 'jarvis-trt-jarvis_punctuation-nn-bert-base-uncased' version 1
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0430 04:50:16.636277 70 plan_backend.cc:338] Creating instance jarvis-trt-waveglow_0_0_gpu0 on GPU 0 (8.6) using model.plan
  > Jarvis waiting for Triton server to load all models...retrying in 1 second
I0430 04:50:17.455655 70 plan_backend.cc:671] Created instance jarvis-trt-waveglow_0_0_gpu0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0430 04:50:17.474904 70 model_repository_manager.cc:960] successfully loaded 'jarvis-trt-waveglow' version 1
I0430 04:50:17.475657 70 model_repository_manager.cc:787] loading: jarvis_punctuation:1
I0430 04:50:17.575782 70 model_repository_manager.cc:787] loading: jasper-asr-trt-ensemble-vad-streaming:1
I0430 04:50:17.675885 70 model_repository_manager.cc:960] successfully loaded 'jarvis_punctuation' version 1
I0430 04:50:17.675924 70 model_repository_manager.cc:787] loading: jasper-asr-trt-ensemble-vad-streaming-offline:1
I0430 04:50:17.776006 70 model_repository_manager.cc:960] successfully loaded 'jasper-asr-trt-ensemble-vad-streaming' version 1
I0430 04:50:17.776035 70 model_repository_manager.cc:787] loading: tacotron2_ensemble:1
I0430 04:50:17.876148 70 model_repository_manager.cc:960] successfully loaded 'jasper-asr-trt-ensemble-vad-streaming-offline' version 1
I0430 04:50:17.876218 70 model_repository_manager.cc:960] successfully loaded 'tacotron2_ensemble' version 1
I0430 04:50:17.876286 70 server.cc:495] 
+-------------------------+-----------------------------------------------------------------------------------------+------+
| Backend                 | Config                                                                                  | Path |
+-------------------------+-----------------------------------------------------------------------------------------+------+
| onnxruntime             | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so                         | {}   |
| jarvis_tts_taco_postnet | /opt/tritonserver/backends/jarvis_tts_taco_postnet/libtriton_jarvis_tts_taco_postnet.so | {}   |
+-------------------------+-----------------------------------------------------------------------------------------+------+

I0430 04:50:17.876369 70 server.cc:538] 
+---------------------------------------------------------------------------------------------+---------+--------+
| Model                                                                                       | Version | Status |
+---------------------------------------------------------------------------------------------+---------+--------+
| jarvis-trt-jarvis_punctuation-nn-bert-base-uncased                                          | 1       | READY  |
| jarvis-trt-jasper                                                                           | 1       | READY  |
| jarvis-trt-tacotron2_encoder                                                                | 1       | READY  |
| jarvis-trt-waveglow                                                                         | 1       | READY  |
| jarvis_detokenize                                                                           | 1       | READY  |
| jarvis_punctuation                                                                          | 1       | READY  |
| jarvis_punctuation_gen_output                                                               | 1       | READY  |
| jarvis_punctuation_label_tokens_cap                                                         | 1       | READY  |
| jarvis_punctuation_label_tokens_punct                                                       | 1       | READY  |
| jarvis_punctuation_merge_labels                                                             | 1       | READY  |
| jarvis_tokenizer                                                                            | 1       | READY  |
| jasper-asr-trt-ensemble-vad-streaming                                                       | 1       | READY  |
| jasper-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming                             | 1       | READY  |
| jasper-asr-trt-ensemble-vad-streaming-feature-extractor-streaming                           | 1       | READY  |
| jasper-asr-trt-ensemble-vad-streaming-offline                                               | 1       | READY  |
| jasper-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline             | 1       | READY  |
| jasper-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline           | 1       | READY  |
| jasper-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline | 1       | READY  |
| jasper-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming                 | 1       | READY  |
| tacotron2_decoder_postnet                                                                   | 1       | READY  |
| tacotron2_ensemble                                                                          | 1       | READY  |
| tts_preprocessor                                                                            | 1       | READY  |
| waveglow_denoiser                                                                           | 1       | READY  |
+---------------------------------------------------------------------------------------------+---------+--------+

I0430 04:50:17.876437 70 tritonserver.cc:1642] 
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                              |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                             |
| server_version                   | 2.7.0                                                                                                                                              |
| server_extensions                | classification sequence model_repository schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics |
| model_repository_path[0]         | /data/models                                                                                                                                       |
| model_control_mode               | MODE_NONE                                                                                                                                          |
| strict_model_config              | 1                                                                                                                                                  |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                          |
| cuda_memory_pool_byte_size{0}    | 1000000000                                                                                                                                         |
| min_supported_compute_capability | 6.0                                                                                                                                                |
| strict_readiness                 | 1                                                                                                                                                  |
| exit_timeout                     | 30                                                                                                                                                 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------+

I0430 04:50:17.886098 70 grpc_server.cc:3979] Started GRPCInferenceService at 0.0.0.0:8001
I0430 04:50:17.886787 70 http_server.cc:2717] Started HTTPService at 0.0.0.0:8000
I0430 04:50:17.954152 70 http_server.cc:2736] Started Metrics Service at 0.0.0.0:8002
  > Triton server is ready...
I0430 04:50:18.070966   246 grpc_health.cc:42] JarvisHealthService initialized with server: localhost:8001
I0430 04:50:18.071218   246 grpc_jarvis_asr.cc:130] Setting uri for ASRServiceImpl
I0430 04:50:18.071220   246 grpc_jarvis_asr.cc:131] Initializing different models
I0430 04:50:18.073035   246 model_registry.cc:52] JarvisModelRegistry initialized with server: localhost:8001
I0430 04:50:18.074175   246 model_registry.cc:81] Server Name: triton, Server version: 2.7.0
I0430 04:50:18.074358   246 model_registry.cc:102] Our model repository has a total of: 23 models
I0430 04:50:18.074362   246 model_registry.cc:107] Model names: jarvis-trt-jarvis_punctuation-nn-bert-base-uncased, Model version: 1
I0430 04:50:18.076444   246 model_registry.cc:107] Model names: jarvis-trt-jasper, Model version: 1
I0430 04:50:18.076787   246 model_registry.cc:107] Model names: jarvis-trt-tacotron2_encoder, Model version: 1
I0430 04:50:18.077157   246 model_registry.cc:107] Model names: jarvis-trt-waveglow, Model version: 1
I0430 04:50:18.077656   246 model_registry.cc:107] Model names: jarvis_detokenize, Model version: 1
I0430 04:50:18.078009   246 model_registry.cc:107] Model names: jarvis_punctuation, Model version: 1
I0430 04:50:18.078497   246 model_registry.cc:107] Model names: jarvis_punctuation_gen_output, Model version: 1
I0430 04:50:18.078847   246 model_registry.cc:107] Model names: jarvis_punctuation_label_tokens_cap, Model version: 1
I0430 04:50:18.079159   246 model_registry.cc:107] Model names: jarvis_punctuation_label_tokens_punct, Model version: 1
I0430 04:50:18.079466   246 model_registry.cc:107] Model names: jarvis_punctuation_merge_labels, Model version: 1
I0430 04:50:18.079771   246 model_registry.cc:107] Model names: jarvis_tokenizer, Model version: 1
I0430 04:50:18.080121   246 model_registry.cc:107] Model names: jasper-asr-trt-ensemble-vad-streaming, Model version: 1
I0430 04:50:18.080615   246 model_registry.cc:120] 'Successfully registering jasper-asr-trt-ensemble-vad-streaming'
I0430 04:50:18.080636   246 model_registry.cc:107] Model names: jasper-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming, Model version: 1
I0430 04:50:18.081128   246 model_registry.cc:107] Model names: jasper-asr-trt-ensemble-vad-streaming-feature-extractor-streaming, Model version: 1
I0430 04:50:18.081665   246 model_registry.cc:107] Model names: jasper-asr-trt-ensemble-vad-streaming-offline, Model version: 1
I0430 04:50:18.082165   246 model_registry.cc:120] 'Successfully registering jasper-asr-trt-ensemble-vad-streaming-offline'
I0430 04:50:18.082185   246 model_registry.cc:107] Model names: jasper-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline, Model version: 1
I0430 04:50:18.082682   246 model_registry.cc:107] Model names: jasper-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline, Model version: 1
I0430 04:50:18.083225   246 model_registry.cc:107] Model names: jasper-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline, Model version: 1
I0430 04:50:18.083628   246 model_registry.cc:107] Model names: jasper-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming, Model version: 1
I0430 04:50:18.084043   246 model_registry.cc:107] Model names: tacotron2_decoder_postnet, Model version: 1
I0430 04:50:18.084533   246 model_registry.cc:107] Model names: tacotron2_ensemble, Model version: 1
I0430 04:50:18.084961   246 model_registry.cc:107] Model names: tts_preprocessor, Model version: 1
I0430 04:50:18.085395   246 model_registry.cc:107] Model names: waveglow_denoiser, Model version: 1
I0430 04:50:18.085832   246 model_registry.cc:125] Successfully registered: 2 models.
I0430 04:50:18.085839   246 client.cc:54] JarvisNLPClient initialized with server: localhost:8001
I0430 04:50:18.086014   246 client.cc:70] Our model repository has: 23 models.
I0430 04:50:18.088192   246 client.cc:88] Registering 'jarvis_punctuation' with service '/nvidia.jarvis.nlp.JarvisCoreNLP/TransformText'
W0430 04:50:18.090278   246 client.cc:94] Registration of 'jasper-asr-trt-ensemble-vad-streaming' failed with unknown service type
W0430 04:50:18.091776   246 client.cc:94] Registration of 'jasper-asr-trt-ensemble-vad-streaming-offline' failed with unknown service type
W0430 04:50:18.094485   246 client.cc:94] Registration of 'tacotron2_ensemble' failed with unknown service type
I0430 04:50:18.095446   246 grpc_jarvis_asr.cc:153] Seeding RNG used for correlation id with time: 1619758218
I0430 04:50:18.095461   246 client.cc:54] JarvisNLPClient initialized with server: localhost:8001
I0430 04:50:18.095611   246 client.cc:70] Our model repository has: 23 models.
I0430 04:50:18.097781   246 client.cc:88] Registering 'jarvis_punctuation' with service '/nvidia.jarvis.nlp.JarvisCoreNLP/TransformText'
W0430 04:50:18.099866   246 client.cc:94] Registration of 'jasper-asr-trt-ensemble-vad-streaming' failed with unknown service type
W0430 04:50:18.101351   246 client.cc:94] Registration of 'jasper-asr-trt-ensemble-vad-streaming-offline' failed with unknown service type
W0430 04:50:18.104070   246 client.cc:94] Registration of 'tacotron2_ensemble' failed with unknown service type
I0430 04:50:18.104939   246 grpc_jarvis_nlp.cc:121] CoreNLPService GRPC service started
I0430 04:50:18.104945   246 client.cc:54] JarvisNLPClient initialized with server: localhost:8001
I0430 04:50:18.105115   246 client.cc:70] Our model repository has: 23 models.
I0430 04:50:18.107288   246 client.cc:88] Registering 'jarvis_punctuation' with service '/nvidia.jarvis.nlp.JarvisCoreNLP/TransformText'
W0430 04:50:18.109361   246 client.cc:94] Registration of 'jasper-asr-trt-ensemble-vad-streaming' failed with unknown service type
W0430 04:50:18.110868   246 client.cc:94] Registration of 'jasper-asr-trt-ensemble-vad-streaming-offline' failed with unknown service type
W0430 04:50:18.113569   246 client.cc:94] Registration of 'tacotron2_ensemble' failed with unknown service type
I0430 04:50:18.114446   246 model_registry.cc:52] JarvisModelRegistry initialized with server: localhost:8001
I0430 04:50:18.114584   246 model_registry.cc:81] Server Name: triton, Server version: 2.7.0
I0430 04:50:18.114729   246 model_registry.cc:102] Our model repository has a total of: 23 models
I0430 04:50:18.114732   246 model_registry.cc:107] Model names: jarvis-trt-jarvis_punctuation-nn-bert-base-uncased, Model version: 1
I0430 04:50:18.115074   246 model_registry.cc:107] Model names: jarvis-trt-jasper, Model version: 1
I0430 04:50:18.115401   246 model_registry.cc:107] Model names: jarvis-trt-tacotron2_encoder, Model version: 1
I0430 04:50:18.115758   246 model_registry.cc:107] Model names: jarvis-trt-waveglow, Model version: 1
I0430 04:50:18.116137   246 model_registry.cc:107] Model names: jarvis_detokenize, Model version: 1
I0430 04:50:18.116478   246 model_registry.cc:107] Model names: jarvis_punctuation, Model version: 1
I0430 04:50:18.116930   246 model_registry.cc:120] 'Successfully registering jarvis_punctuation'
I0430 04:50:18.116951   246 model_registry.cc:107] Model names: jarvis_punctuation_gen_output, Model version: 1
I0430 04:50:18.117287   246 model_registry.cc:107] Model names: jarvis_punctuation_label_tokens_cap, Model version: 1
I0430 04:50:18.117599   246 model_registry.cc:107] Model names: jarvis_punctuation_label_tokens_punct, Model version: 1
I0430 04:50:18.117902   246 model_registry.cc:107] Model names: jarvis_punctuation_merge_labels, Model version: 1
I0430 04:50:18.118211   246 model_registry.cc:107] Model names: jarvis_tokenizer, Model version: 1
I0430 04:50:18.118567   246 model_registry.cc:107] Model names: jasper-asr-trt-ensemble-vad-streaming, Model version: 1
I0430 04:50:18.119055   246 model_registry.cc:120] 'Successfully registering jasper-asr-trt-ensemble-vad-streaming'
I0430 04:50:18.119074   246 model_registry.cc:107] Model names: jasper-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming, Model version: 1
I0430 04:50:18.119570   246 model_registry.cc:107] Model names: jasper-asr-trt-ensemble-vad-streaming-feature-extractor-streaming, Model version: 1
I0430 04:50:18.120115   246 model_registry.cc:107] Model names: jasper-asr-trt-ensemble-vad-streaming-offline, Model version: 1
I0430 04:50:18.120597   246 model_registry.cc:120] 'Successfully registering jasper-asr-trt-ensemble-vad-streaming-offline'
I0430 04:50:18.120615   246 model_registry.cc:107] Model names: jasper-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline, Model version: 1
I0430 04:50:18.121104   246 model_registry.cc:107] Model names: jasper-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline, Model version: 1
I0430 04:50:18.121642   246 model_registry.cc:107] Model names: jasper-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline, Model version: 1
I0430 04:50:18.122045   246 model_registry.cc:107] Model names: jasper-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming, Model version: 1
I0430 04:50:18.122459   246 model_registry.cc:107] Model names: tacotron2_decoder_postnet, Model version: 1
I0430 04:50:18.122956   246 model_registry.cc:107] Model names: tacotron2_ensemble, Model version: 1
I0430 04:50:18.123376   246 model_registry.cc:120] 'Successfully registering tacotron2_ensemble'
I0430 04:50:18.123391   246 model_registry.cc:107] Model names: tts_preprocessor, Model version: 1
I0430 04:50:18.123827   246 model_registry.cc:107] Model names: waveglow_denoiser, Model version: 1
I0430 04:50:18.124269   246 model_registry.cc:125] Successfully registered: 4 models.
I0430 04:50:18.124275   246 grpc_jarvis_nlp.cc:130] NLPService GRPC service started
I0430 04:50:18.124279   246 grpc_jarvis_tts.cc:57] Setting uri for TTSServiceImpl
I0430 04:50:18.124279   246 grpc_jarvis_tts.cc:58] Initializing models
I0430 04:50:18.124282   246 model_registry.cc:52] JarvisModelRegistry initialized with server: localhost:8001
I0430 04:50:18.124416   246 model_registry.cc:81] Server Name: triton, Server version: 2.7.0
I0430 04:50:18.124562   246 model_registry.cc:102] Our model repository has a total of: 23 models
I0430 04:50:18.124565   246 model_registry.cc:107] Model names: jarvis-trt-jarvis_punctuation-nn-bert-base-uncased, Model version: 1
I0430 04:50:18.124908   246 model_registry.cc:107] Model names: jarvis-trt-jasper, Model version: 1
I0430 04:50:18.125236   246 model_registry.cc:107] Model names: jarvis-trt-tacotron2_encoder, Model version: 1
I0430 04:50:18.125592   246 model_registry.cc:107] Model names: jarvis-trt-waveglow, Model version: 1
I0430 04:50:18.125972   246 model_registry.cc:107] Model names: jarvis_detokenize, Model version: 1
I0430 04:50:18.126324   246 model_registry.cc:107] Model names: jarvis_punctuation, Model version: 1
I0430 04:50:18.126787   246 model_registry.cc:107] Model names: jarvis_punctuation_gen_output, Model version: 1
I0430 04:50:18.127125   246 model_registry.cc:107] Model names: jarvis_punctuation_label_tokens_cap, Model version: 1
I0430 04:50:18.127434   246 model_registry.cc:107] Model names: jarvis_punctuation_label_tokens_punct, Model version: 1
I0430 04:50:18.127738   246 model_registry.cc:107] Model names: jarvis_punctuation_merge_labels, Model version: 1
I0430 04:50:18.128042   246 model_registry.cc:107] Model names: jarvis_tokenizer, Model version: 1
I0430 04:50:18.128391   246 model_registry.cc:107] Model names: jasper-asr-trt-ensemble-vad-streaming, Model version: 1
I0430 04:50:18.128878   246 model_registry.cc:107] Model names: jasper-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming, Model version: 1
I0430 04:50:18.129365   246 model_registry.cc:107] Model names: jasper-asr-trt-ensemble-vad-streaming-feature-extractor-streaming, Model version: 1
I0430 04:50:18.129901   246 model_registry.cc:107] Model names: jasper-asr-trt-ensemble-vad-streaming-offline, Model version: 1
I0430 04:50:18.130393   246 model_registry.cc:107] Model names: jasper-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline, Model version: 1
I0430 04:50:18.130882   246 model_registry.cc:107] Model names: jasper-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline, Model version: 1
I0430 04:50:18.131417   246 model_registry.cc:107] Model names: jasper-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline, Model version: 1
I0430 04:50:18.131819   246 model_registry.cc:107] Model names: jasper-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming, Model version: 1
I0430 04:50:18.132230   246 model_registry.cc:107] Model names: tacotron2_decoder_postnet, Model version: 1
I0430 04:50:18.132726   246 model_registry.cc:107] Model names: tacotron2_ensemble, Model version: 1
I0430 04:50:18.133149   246 model_registry.cc:120] 'Successfully registering tacotron2_ensemble'
I0430 04:50:18.133165   246 model_registry.cc:107] Model names: tts_preprocessor, Model version: 1
I0430 04:50:18.133599   246 model_registry.cc:107] Model names: waveglow_denoiser, Model version: 1
I0430 04:50:18.134037   246 model_registry.cc:125] Successfully registered: 1 models.
I0430 04:50:18.134043   246 grpc_jarvis_tts.cc:68] Seeding RNG used for correlation id with time: 1619758218
I0430 04:50:18.134156   246 jarvis_server.cc:68] NLP Service connected to Triton at localhost:8001
I0430 04:50:18.134161   246 jarvis_server.cc:70] ASR Service connected to Triton at localhost:8001
I0430 04:50:18.134162   246 jarvis_server.cc:72] TTS Service connected to Triton at localhost:8001
I0430 04:50:18.134164   246 jarvis_server.cc:73] Jarvis Conversational AI Server listening on 0.0.0.0:50051

Hi @sonle,
Please refer to the per-model GPU memory requirements in the support matrix below:
https://docs.nvidia.com/deeplearning/jarvis/user-guide/docs/support-matrix.html

Also, TRT engine generation may need some additional memory on top of those figures. So, depending on how much GPU memory you have, you can try deploying different models, a few at a time, to track the exact memory consumption in your case.
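One way to track that consumption is to read the GPU memory counters before and after starting the server with a given set of models and compare the two readings. The snippet below is only a rough sketch, not part of Jarvis: it assumes `nvidia-smi` is on the PATH of the host, and the helper names (`parse_gpu_memory`, `read_gpu_memory`, `memory_delta`) are made up for illustration.

```python
import subprocess

# nvidia-smi query that prints one CSV row per GPU: index, used MiB, total MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse rows like '0, 1024, 12288' into a list of dicts (values in MiB)."""
    rows = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, used, total = (int(field.strip()) for field in line.split(","))
        rows.append({"gpu": index, "used_mib": used, "total_mib": total})
    return rows


def read_gpu_memory():
    """Run nvidia-smi once and return the parsed per-GPU memory readings."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)


def memory_delta(before, after):
    """Return {gpu index: extra MiB in use} between two readings."""
    used_before = {row["gpu"]: row["used_mib"] for row in before}
    return {
        row["gpu"]: row["used_mib"] - used_before.get(row["gpu"], 0)
        for row in after
    }


if __name__ == "__main__":
    # Take one reading; run this before and after jarvis_start.sh and compare.
    for row in read_gpu_memory():
        print(f"GPU {row['gpu']}: {row['used_mib']} / {row['total_mib']} MiB used")
```

Take one reading with the server stopped and another once the models have finished loading; the difference gives a rough figure for that model set. Peak usage during TRT engine generation can be higher than the steady-state reading, so polling while `jarvis_init.sh` runs gives a better picture of the worst case.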

Thanks