Jarvis: Triton server died before reaching ready state. Terminating Jarvis startup

Hello,
I just want to start the Jarvis server with jarvis_init.sh

and then jarvis_start.sh. When running jarvis_start.sh it fails with the message:

Health ready check failed

I tried the approach explained in the other topic:

So I modified config.sh so that only the punctuation model from NLP is loaded.
But I still get the following error in docker logs jarvis-speech:
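Concretely, the change came down to the service toggles in config.sh (the full file is pasted further down in this thread):

```shell
# Relevant lines from my config.sh: disable ASR and TTS, keep only NLP
service_enabled_asr=false
service_enabled_nlp=true
service_enabled_tts=false
```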

==========================
== Jarvis Speech Skills ==
==========================
NVIDIA Release 20.11 (build 19933361)

Copyright (c) 2018-2020, NVIDIA CORPORATION. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying
project or file.

NOTE: The SHMEM allocation limit is set to the default of 64MB. This may be
insufficient for the inference server. NVIDIA recommends the use of the following flags:
nvidia-docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 …

Jarvis waiting for Triton server to load all models…retrying in 1 second
I0504 13:51:10.504064 73 metrics.cc:219] Collecting metrics for GPU 0: GeForce RTX 2070 SUPER
I0504 13:51:10.637832 73 pinned_memory_manager.cc:199] Pinned memory pool is created at ‘0x7fa6ca000000’ with size 268435456
I0504 13:51:10.638011 73 cuda_memory_manager.cc:99] CUDA memory pool is created on device 0 with size 1000000000
E0504 13:51:10.746951 73 model_repository_manager.cc:1705] failed to open text file for read /data/models/jarvis_label_tokens_weather/config.pbtxt: No such file or directory
E0504 13:51:10.836671 73 model_repository_manager.cc:1705] failed to open text file for read /data/models/tacotron2_decoder_postnet/config.pbtxt: No such file or directory
E0504 13:51:10.849634 73 model_repository_manager.cc:1183] Invalid argument: ensemble jarvis_intent_weather contains models that are not available: jarvis_label_tokens_weather
E0504 13:51:10.849684 73 model_repository_manager.cc:1183] Invalid argument: ensemble tacotron2_ensemble contains models that are not available: tacotron2_decoder_postnet
I0504 13:51:10.849833 73 model_repository_manager.cc:810] loading: jarvis-trt-jarvis_ner-nn-bert-base-uncased:1
I0504 13:51:10.850161 73 model_repository_manager.cc:810] loading: jarvis-trt-jarvis_intent_weather-nn-bert-base-uncased:1
I0504 13:51:10.850485 73 model_repository_manager.cc:810] loading: jarvis-trt-jarvis_punctuation-nn-bert-base-uncased:1
I0504 13:51:10.850805 73 model_repository_manager.cc:810] loading: jarvis-trt-jarvis_qa-nn-bert-base-uncased:1
I0504 13:51:10.851341 73 model_repository_manager.cc:810] loading: jarvis-trt-jarvis_text_classification_domain-nn-bert-base-uncased:1
I0504 13:51:10.851689 73 model_repository_manager.cc:810] loading: jarvis-trt-jasper:1
I0504 13:51:10.852130 73 model_repository_manager.cc:810] loading: jarvis-trt-tacotron2_encoder:1
I0504 13:51:10.852620 73 model_repository_manager.cc:810] loading: jarvis-trt-waveglow:1
I0504 13:51:10.853065 73 model_repository_manager.cc:810] loading: jarvis_detokenize:1
I0504 13:51:10.853659 73 model_repository_manager.cc:810] loading: jarvis_ner_label_tokens:1
I0504 13:51:10.854007 73 model_repository_manager.cc:810] loading: jarvis_punctuation_gen_output:1
I0504 13:51:10.854097 73 custom_backend.cc:198] Creating instance jarvis_detokenize_0_0_cpu on CPU using libtriton_jarvis_nlp_detokenizer.so
I0504 13:51:10.854119 73 model_repository_manager.cc:810] loading: jarvis_punctuation_label_tokens_cap:1
I0504 13:51:10.854211 73 model_repository_manager.cc:810] loading: jarvis_punctuation_label_tokens_punct:1
I0504 13:51:10.854341 73 custom_backend.cc:198] Creating instance jarvis_punctuation_gen_output_0_0_cpu on CPU using libtriton_jarvis_nlp_punctuation.so
I0504 13:51:10.854366 73 model_repository_manager.cc:810] loading: jarvis_punctuation_merge_labels:1
I0504 13:51:10.854478 73 model_repository_manager.cc:810] loading: jarvis_qa_postprocessor:1
I0504 13:51:10.854530 73 custom_backend.cc:198] Creating instance jarvis_punctuation_label_tokens_cap_0_0_cpu on CPU using libtriton_jarvis_nlp_seqlabel.so
I0504 13:51:10.854566 73 custom_backend.cc:198] Creating instance jarvis_ner_label_tokens_0_0_cpu on CPU using libtriton_jarvis_nlp_seqlabel.so
I0504 13:51:10.854611 73 model_repository_manager.cc:810] loading: jarvis_qa_preprocessor:1
I0504 13:51:10.854703 73 model_repository_manager.cc:810] loading: jarvis_tokenizer:1
I0504 13:51:10.854789 73 custom_backend.cc:198] Creating instance jarvis_punctuation_merge_labels_0_0_cpu on CPU using libtriton_jarvis_nlp_labels.so
I0504 13:51:10.854812 73 custom_backend.cc:198] Creating instance jarvis_punctuation_label_tokens_punct_0_0_cpu on CPU using libtriton_jarvis_nlp_seqlabel.so
I0504 13:51:10.854865 73 model_repository_manager.cc:810] loading: jasper-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming:1
I0504 13:51:10.854960 73 model_repository_manager.cc:810] loading: jasper-asr-trt-ensemble-vad-streaming-feature-extractor-streaming:1
I0504 13:51:10.855022 73 custom_backend.cc:198] Creating instance jarvis_qa_postprocessor_0_0_cpu on CPU using libtriton_jarvis_nlp_qa.so
I0504 13:51:10.855042 73 custom_backend.cc:198] Creating instance jarvis_qa_preprocessor_0_0_cpu on CPU using libtriton_jarvis_nlp_tokenizer.so
I0504 13:51:10.855086 73 model_repository_manager.cc:810] loading: jasper-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline:1
I0504 13:51:10.855185 73 custom_backend.cc:198] Creating instance jarvis_tokenizer_0_0_cpu on CPU using libtriton_jarvis_nlp_tokenizer.so
I0504 13:51:10.855237 73 model_repository_manager.cc:810] loading: jasper-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline:1
I0504 13:51:10.855360 73 model_repository_manager.cc:810] loading: jasper-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline:1
I0504 13:51:10.855369 73 custom_backend.cc:198] Creating instance jasper-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming_0_0_cpu on CPU using libtriton_jarvis_asr_decoder_cpu.so
I0504 13:51:10.855516 73 model_repository_manager.cc:810] loading: jasper-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming:1
I0504 13:51:10.855623 73 custom_backend.cc:198] Creating instance jasper-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline_0_0_cpu on CPU using libtriton_jarvis_asr_decoder_cpu.so
I0504 13:51:10.855633 73 model_repository_manager.cc:810] loading: tts_preprocessor:1
I0504 13:51:10.855785 73 model_repository_manager.cc:810] loading: waveglow_denoiser:1
I0504 13:51:10.855849 73 custom_backend.cc:201] Creating instance jasper-asr-trt-ensemble-vad-streaming-feature-extractor-streaming_0_0_gpu0 on GPU 0 (7.5) using libtriton_jarvis_asr_features.so
I0504 13:51:10.856135 73 custom_backend.cc:198] Creating instance jasper-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline_0_0_cpu on CPU using libtriton_jarvis_asr_vad.so
I0504 13:51:10.856242 73 custom_backend.cc:198] Creating instance jasper-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming_0_0_cpu on CPU using libtriton_jarvis_asr_vad.so
I0504 13:51:10.856354 73 custom_backend.cc:201] Creating instance jasper-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline_0_0_gpu0 on GPU 0 (7.5) using libtriton_jarvis_asr_features.so
I0504 13:51:10.856625 73 custom_backend.cc:201] Creating instance tts_preprocessor_0_0_gpu0 on GPU 0 (7.5) using libtriton_jarvis_tts_preprocessor.so
I0504 13:51:10.856659 73 custom_backend.cc:201] Creating instance waveglow_denoiser_0_0_gpu0 on GPU 0 (7.5) using libtriton_jarvis_tts_denoiser.so
I0504 13:51:10.880984 73 model_repository_manager.cc:983] successfully loaded ‘jarvis_detokenize’ version 1
Jarvis waiting for Triton server to load all models…retrying in 1 second
I0504 13:51:11.813319 73 model_repository_manager.cc:983] successfully loaded ‘jarvis_ner_label_tokens’ version 1
I0504 13:51:12.198163 73 model_repository_manager.cc:983] successfully loaded ‘jarvis_punctuation_merge_labels’ version 1
Jarvis waiting for Triton server to load all models…retrying in 1 second
I0504 13:51:12.499487 73 model_repository_manager.cc:983] successfully loaded ‘jarvis_punctuation_label_tokens_punct’ version 1
I0504 13:51:12.821327 73 model_repository_manager.cc:983] successfully loaded ‘jarvis_qa_postprocessor’ version 1
Jarvis waiting for Triton server to load all models…retrying in 1 second
Jarvis waiting for Triton server to load all models…retrying in 1 second
I0504 13:51:14.851041 73 model_repository_manager.cc:983] successfully loaded ‘jarvis_qa_preprocessor’ version 1
I0504 13:51:16.064743 73 model_repository_manager.cc:983] successfully loaded ‘jasper-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline’ version 1
I0504 13:51:16.064748 73 model_repository_manager.cc:983] successfully loaded ‘jasper-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming’ version 1
I0504 13:51:16.413629 73 model_repository_manager.cc:983] successfully loaded ‘jarvis_punctuation_gen_output’ version 1
I0504 13:51:16.612826 73 model_repository_manager.cc:983] successfully loaded ‘jarvis_punctuation_label_tokens_cap’ version 1
Jarvis waiting for Triton server to load all models…retrying in 1 second
I0504 13:51:17.914077 73 model_repository_manager.cc:983] successfully loaded ‘tts_preprocessor’ version 1
Jarvis waiting for Triton server to load all models…retrying in 1 second
I0504 13:51:18.464281 73 model_repository_manager.cc:983] successfully loaded ‘jarvis_tokenizer’ version 1
Jarvis waiting for Triton server to load all models…retrying in 1 second
(previous line repeated 38 more times)
I0504 13:51:59.502117 73 model_repository_manager.cc:983] successfully loaded ‘jasper-asr-trt-ensemble-vad-streaming-feature-extractor-streaming’ version 1
I0504 13:51:59.502181 73 model_repository_manager.cc:983] successfully loaded ‘jasper-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline’ version 1
Jarvis waiting for Triton server to load all models…retrying in 1 second
Jarvis waiting for Triton server to load all models…retrying in 1 second
Jarvis waiting for Triton server to load all models…retrying in 1 second
Jarvis waiting for Triton server to load all models…retrying in 1 second
I0504 13:52:03.986163 73 model_repository_manager.cc:983] successfully loaded ‘jasper-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline’ version 1
Jarvis waiting for Triton server to load all models…retrying in 1 second
Jarvis waiting for Triton server to load all models…retrying in 1 second
Jarvis waiting for Triton server to load all models…retrying in 1 second
Jarvis waiting for Triton server to load all models…retrying in 1 second
Jarvis waiting for Triton server to load all models…retrying in 1 second
I0504 13:52:09.140760 73 plan_backend.cc:333] Creating instance jarvis-trt-tacotron2_encoder_0_0_gpu0 on GPU 0 (7.5) using model.plan
Jarvis waiting for Triton server to load all models…retrying in 1 second
Jarvis waiting for Triton server to load all models…retrying in 1 second
Jarvis waiting for Triton server to load all models…retrying in 1 second
Jarvis waiting for Triton server to load all models…retrying in 1 second
Jarvis waiting for Triton server to load all models…retrying in 1 second
Jarvis waiting for Triton server to load all models…retrying in 1 second
I0504 13:52:15.368770 73 plan_backend.cc:670] Created instance jarvis-trt-tacotron2_encoder_0_0_gpu0 on GPU 0 with stream priority 0
I0504 13:52:15.385669 73 model_repository_manager.cc:983] successfully loaded ‘jarvis-trt-tacotron2_encoder’ version 1
I0504 13:52:15.385762 73 model_repository_manager.cc:983] successfully loaded ‘waveglow_denoiser’ version 1
Jarvis waiting for Triton server to load all models…retrying in 1 second
I0504 13:52:16.099096 73 plan_backend.cc:333] Creating instance jarvis-trt-waveglow_0_0_gpu0 on GPU 0 (7.5) using model.plan
I0504 13:52:16.531296 73 model_repository_manager.cc:983] successfully loaded ‘jasper-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming’ version 1
Jarvis waiting for Triton server to load all models…retrying in 1 second
E0504 13:52:16.801603 73 logging.cc:43] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
E0504 13:52:16.810415 73 logging.cc:43] FAILED_ALLOCATION: std::exception
E0504 13:52:16.848400 73 model_repository_manager.cc:986] failed to load ‘jarvis-trt-waveglow’ version 1: Internal: unable to create TensorRT context
Jarvis waiting for Triton server to load all models…retrying in 1 second
Jarvis waiting for Triton server to load all models…retrying in 1 second
I0504 13:52:19.241385 73 plan_backend.cc:333] Creating instance jarvis-trt-jasper_0_0_gpu0 on GPU 0 (7.5) using model.plan
Jarvis waiting for Triton server to load all models…retrying in 1 second
Jarvis waiting for Triton server to load all models…retrying in 1 second
Jarvis waiting for Triton server to load all models…retrying in 1 second
I0504 13:52:21.663198 73 plan_backend.cc:666] Created instance jarvis-trt-jasper_0_0_gpu0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0504 13:52:21.684487 73 model_repository_manager.cc:983] successfully loaded ‘jarvis-trt-jasper’ version 1
Jarvis waiting for Triton server to load all models…retrying in 1 second
I0504 13:52:22.673479 73 plan_backend.cc:333] Creating instance jarvis-trt-jarvis_text_classification_domain-nn-bert-base-uncased_0_0_gpu0 on GPU 0 (7.5) using model.plan
E0504 13:52:23.474276 73 logging.cc:43] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
E0504 13:52:23.474305 73 logging.cc:43] INTERNAL_ERROR: std::exception
E0504 13:52:23.497639 73 logging.cc:43] /home/jenkins/workspace/OSS/L0_MergeRequest/oss/plugin/common/bertCommon.h (383) - Cuda Error in copyToDevice: 2 (out of memory)
E0504 13:52:23.499575 73 logging.cc:43] FAILED_ALLOCATION: std::exception
E0504 13:52:23.527953 73 model_repository_manager.cc:986] failed to load ‘jarvis-trt-jarvis_text_classification_domain-nn-bert-base-uncased’ version 1: Internal: unable to create TensorRT context
Jarvis waiting for Triton server to load all models…retrying in 1 second
E0504 13:52:24.211975 73 logging.cc:43] /home/jenkins/workspace/OSS/L0_MergeRequest/oss/plugin/common/bertCommon.h (383) - Cuda Error in copyToDevice: 2 (out of memory)
E0504 13:52:24.212034 73 logging.cc:43] INVALID_STATE: std::exception
E0504 13:52:24.212083 73 logging.cc:43] INVALID_CONFIG: Deserialize the cuda engine failed.
E0504 13:52:24.235810 73 model_repository_manager.cc:986] failed to load ‘jarvis-trt-jarvis_qa-nn-bert-base-uncased’ version 1: Internal: unable to create TensorRT engine
Jarvis waiting for Triton server to load all models…retrying in 1 second
E0504 13:52:24.882503 73 logging.cc:43] /home/jenkins/workspace/OSS/L0_MergeRequest/oss/plugin/common/bertCommon.h (383) - Cuda Error in copyToDevice: 2 (out of memory)
E0504 13:52:24.886233 73 logging.cc:43] INVALID_STATE: std::exception
E0504 13:52:24.886271 73 logging.cc:43] INVALID_CONFIG: Deserialize the cuda engine failed.
E0504 13:52:24.909164 73 model_repository_manager.cc:986] failed to load ‘jarvis-trt-jarvis_ner-nn-bert-base-uncased’ version 1: Internal: unable to create TensorRT engine
E0504 13:52:25.549697 73 logging.cc:43] /home/jenkins/workspace/OSS/L0_MergeRequest/oss/plugin/common/bertCommon.h (383) - Cuda Error in copyToDevice: 2 (out of memory)
E0504 13:52:25.553698 73 logging.cc:43] INVALID_STATE: std::exception
E0504 13:52:25.553730 73 logging.cc:43] INVALID_CONFIG: Deserialize the cuda engine failed.
E0504 13:52:25.576133 73 model_repository_manager.cc:986] failed to load ‘jarvis-trt-jarvis_punctuation-nn-bert-base-uncased’ version 1: Internal: unable to create TensorRT engine
Jarvis waiting for Triton server to load all models…retrying in 1 second
E0504 13:52:26.221905 73 logging.cc:43] /home/jenkins/workspace/OSS/L0_MergeRequest/oss/plugin/common/bertCommon.h (383) - Cuda Error in copyToDevice: 2 (out of memory)
E0504 13:52:26.226081 73 logging.cc:43] INVALID_STATE: std::exception
E0504 13:52:26.226131 73 logging.cc:43] INVALID_CONFIG: Deserialize the cuda engine failed.
E0504 13:52:26.248880 73 model_repository_manager.cc:986] failed to load ‘jarvis-trt-jarvis_intent_weather-nn-bert-base-uncased’ version 1: Internal: unable to create TensorRT engine
E0504 13:52:26.264011 73 model_repository_manager.cc:1183] Invalid argument: ensemble ‘jarvis_ner’ depends on ‘jarvis-trt-jarvis_ner-nn-bert-base-uncased’ which has no loaded version
E0504 13:52:26.264025 73 model_repository_manager.cc:1183] Invalid argument: ensemble ‘jarvis_punctuation’ depends on ‘jarvis-trt-jarvis_punctuation-nn-bert-base-uncased’ which has no loaded version
E0504 13:52:26.264028 73 model_repository_manager.cc:1183] Invalid argument: ensemble ‘jarvis_qa’ depends on ‘jarvis-trt-jarvis_qa-nn-bert-base-uncased’ which has no loaded version
E0504 13:52:26.264031 73 model_repository_manager.cc:1183] Invalid argument: ensemble ‘jarvis_text_classification_domain’ depends on ‘jarvis-trt-jarvis_text_classification_domain-nn-bert-base-uncased’ which has no loaded version
I0504 13:52:26.264311 73 model_repository_manager.cc:810] loading: jasper-asr-trt-ensemble-vad-streaming:1
I0504 13:52:26.264378 73 model_repository_manager.cc:810] loading: jasper-asr-trt-ensemble-vad-streaming-offline:1
I0504 13:52:26.264502 73 model_repository_manager.cc:983] successfully loaded ‘jasper-asr-trt-ensemble-vad-streaming’ version 1
I0504 13:52:26.264568 73 model_repository_manager.cc:983] successfully loaded ‘jasper-asr-trt-ensemble-vad-streaming-offline’ version 1
I0504 13:52:26.264619 73 server.cc:141]
+---------+--------+------+
| Backend | Config | Path |
+---------+--------+------+
+---------+--------+------+

I0504 13:52:26.264756 73 server.cc:184]
+----------------------------------------------------------------------------------------------+---------+-----------------------------------------------------------+
| Model | Version | Status |
+----------------------------------------------------------------------------------------------+---------+-----------------------------------------------------------+
| jarvis-trt-jarvis_intent_weather-nn-bert-base-uncased | 1 | UNAVAILABLE: Internal: unable to create TensorRT engine |
| jarvis-trt-jarvis_ner-nn-bert-base-uncased | 1 | UNAVAILABLE: Internal: unable to create TensorRT engine |
| jarvis-trt-jarvis_punctuation-nn-bert-base-uncased | 1 | UNAVAILABLE: Internal: unable to create TensorRT engine |
| jarvis-trt-jarvis_qa-nn-bert-base-uncased | 1 | UNAVAILABLE: Internal: unable to create TensorRT engine |
| jarvis-trt-jarvis_text_classification_domain-nn-bert-base-uncased | 1 | UNAVAILABLE: Internal: unable to create TensorRT context |
| jarvis-trt-jasper | 1 | READY |
| jarvis-trt-tacotron2_encoder | 1 | READY |
| jarvis-trt-waveglow | 1 | UNAVAILABLE: Internal: unable to create TensorRT context |
| jarvis_detokenize | 1 | READY |
| jarvis_intent_weather | - | Not loaded: No model version was found |
| jarvis_ner | - | Not loaded: No model version was found |
| jarvis_ner_label_tokens | 1 | READY |
| jarvis_punctuation | - | Not loaded: No model version was found |
| jarvis_punctuation_gen_output | 1 | READY |
| jarvis_punctuation_label_tokens_cap | 1 | READY |
| jarvis_punctuation_label_tokens_punct | 1 | READY |
| jarvis_punctuation_merge_labels | 1 | READY |
| jarvis_qa | - | Not loaded: No model version was found |
| jarvis_qa_postprocessor | 1 | READY |
| jarvis_qa_preprocessor | 1 | READY |
| jarvis_text_classification_domain | - | Not loaded: No model version was found |
| jarvis_tokenizer | 1 | READY |
| jasper-asr-trt-ensemble-vad-streaming | 1 | READY |
| jasper-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming | 1 | READY |
| jasper-asr-trt-ensemble-vad-streaming-feature-extractor-streaming | 1 | READY |
| jasper-asr-trt-ensemble-vad-streaming-offline | 1 | READY |
| jasper-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline | 1 | READY |
| jasper-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline | 1 | READY |
| jasper-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline | 1 | READY |
| jasper-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming | 1 | READY |
| tacotron2_ensemble | - | Not loaded: No model version was found |
| tts_preprocessor | 1 | READY |
| waveglow_denoiser | 1 | READY |
+----------------------------------------------------------------------------------------------+---------+-----------------------------------------------------------+

I0504 13:52:26.274751 73 tritonserver.cc:1620]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.5.0 |
| server_extensions | classification sequence model_repository schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics |
| model_repository_path[0] | /data/models |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 1000000000 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------+

I0504 13:52:26.274758 73 server.cc:280] Waiting for in-flight requests to complete.
I0504 13:52:26.274762 73 model_repository_manager.cc:837] unloading: tts_preprocessor:1
I0504 13:52:26.274783 73 model_repository_manager.cc:837] unloading: jasper-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming:1
I0504 13:52:26.274821 73 model_repository_manager.cc:837] unloading: jasper-asr-trt-ensemble-vad-streaming-feature-extractor-streaming:1
I0504 13:52:26.274975 73 model_repository_manager.cc:837] unloading: jasper-asr-trt-ensemble-vad-streaming:1
I0504 13:52:26.275443 73 model_repository_manager.cc:837] unloading: jasper-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline:1
I0504 13:52:26.275489 73 model_repository_manager.cc:966] successfully unloaded ‘jasper-asr-trt-ensemble-vad-streaming’ version 1
I0504 13:52:26.275503 73 model_repository_manager.cc:837] unloading: jarvis_qa_postprocessor:1
I0504 13:52:26.275526 73 model_repository_manager.cc:837] unloading: jasper-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline:1
I0504 13:52:26.275572 73 model_repository_manager.cc:837] unloading: jarvis_punctuation_merge_labels:1
I0504 13:52:26.275817 73 model_repository_manager.cc:837] unloading: jarvis_tokenizer:1
I0504 13:52:26.275858 73 model_repository_manager.cc:837] unloading: jarvis-trt-jasper:1
I0504 13:52:26.275916 73 model_repository_manager.cc:837] unloading: waveglow_denoiser:1
I0504 13:52:26.275952 73 model_repository_manager.cc:837] unloading: jasper-asr-trt-ensemble-vad-streaming-offline:1
I0504 13:52:26.275991 73 model_repository_manager.cc:837] unloading: jarvis_detokenize:1
I0504 13:52:26.276028 73 model_repository_manager.cc:837] unloading: jarvis_qa_preprocessor:1
I0504 13:52:26.276062 73 model_repository_manager.cc:837] unloading: jarvis-trt-tacotron2_encoder:1
I0504 13:52:26.276099 73 model_repository_manager.cc:837] unloading: jasper-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline:1
I0504 13:52:26.276148 73 model_repository_manager.cc:966] successfully unloaded ‘jasper-asr-trt-ensemble-vad-streaming-offline’ version 1
I0504 13:52:26.277138 73 model_repository_manager.cc:837] unloading: jasper-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming:1
I0504 13:52:26.294588 73 model_repository_manager.cc:837] unloading: jarvis_ner_label_tokens:1
I0504 13:52:26.294847 73 model_repository_manager.cc:837] unloading: jarvis_punctuation_gen_output:1
I0504 13:52:26.294953 73 model_repository_manager.cc:837] unloading: jarvis_punctuation_label_tokens_cap:1
I0504 13:52:26.294991 73 model_repository_manager.cc:837] unloading: jarvis_punctuation_label_tokens_punct:1
I0504 13:52:26.295045 73 server.cc:295] Timeout 30: Found 19 live models and 0 in-flight non-inference requests
I0504 13:52:26.319848 73 model_repository_manager.cc:966] successfully unloaded ‘tts_preprocessor’ version 1
I0504 13:52:26.321158 73 model_repository_manager.cc:966] successfully unloaded ‘jarvis_qa_postprocessor’ version 1
I0504 13:52:26.321393 73 model_repository_manager.cc:966] successfully unloaded ‘jarvis_punctuation_merge_labels’ version 1
I0504 13:52:26.321426 73 model_repository_manager.cc:966] successfully unloaded ‘jarvis_detokenize’ version 1
I0504 13:52:26.321476 73 model_repository_manager.cc:966] successfully unloaded ‘jasper-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming’ version 1
I0504 13:52:26.321610 73 model_repository_manager.cc:966] successfully unloaded ‘jasper-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline’ version 1
I0504 13:52:26.321615 73 model_repository_manager.cc:966] successfully unloaded ‘jarvis_qa_preprocessor’ version 1
I0504 13:52:26.321648 73 model_repository_manager.cc:966] successfully unloaded ‘jarvis_ner_label_tokens’ version 1
I0504 13:52:26.321771 73 model_repository_manager.cc:966] successfully unloaded ‘jarvis_punctuation_gen_output’ version 1
I0504 13:52:26.321806 73 model_repository_manager.cc:966] successfully unloaded ‘jarvis_punctuation_label_tokens_cap’ version 1
I0504 13:52:26.321948 73 model_repository_manager.cc:966] successfully unloaded ‘jarvis_punctuation_label_tokens_punct’ version 1
I0504 13:52:26.321982 73 model_repository_manager.cc:966] successfully unloaded ‘jasper-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline’ version 1
I0504 13:52:26.322467 73 model_repository_manager.cc:966] successfully unloaded ‘jarvis_tokenizer’ version 1
I0504 13:52:26.325468 73 model_repository_manager.cc:966] successfully unloaded ‘jasper-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline’ version 1
I0504 13:52:26.389993 73 model_repository_manager.cc:966] successfully unloaded ‘jasper-asr-trt-ensemble-vad-streaming-feature-extractor-streaming’ version 1
I0504 13:52:26.400994 73 model_repository_manager.cc:966] successfully unloaded ‘jasper-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming’ version 1
I0504 13:52:26.416035 73 model_repository_manager.cc:966] successfully unloaded ‘waveglow_denoiser’ version 1
I0504 13:52:26.427480 73 model_repository_manager.cc:966] successfully unloaded ‘jarvis-trt-jasper’ version 1
I0504 13:52:26.428065 73 model_repository_manager.cc:966] successfully unloaded ‘jarvis-trt-tacotron2_encoder’ version 1

Jarvis waiting for Triton server to load all models…retrying in 1 second
I0504 13:52:27.295211 73 server.cc:295] Timeout 29: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
Jarvis waiting for Triton server to load all models…retrying in 1 second
Jarvis waiting for Triton server to load all models…retrying in 1 second
Jarvis waiting for Triton server to load all models…retrying in 1 second
Triton server died before reaching ready state. Terminating Jarvis startup.
Check Triton logs with: docker logs
/opt/jarvis/bin/start-jarvis: line 1: kill: (73) - No such process
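For reference, the SHMEM NOTE near the top of the log recommends launching the container with a larger shared-memory allocation. Assembled into a command it would look roughly like this; the image name below is a placeholder, not the actual tag the quickstart scripts run:

```shell
# Sketch of the flags the startup banner recommends ("--gpus all" used in
# place of nvidia-docker). "jarvis-speech-image" is a placeholder for
# whatever image tag jarvis_start.sh actually launches on your system.
JARVIS_IMAGE="jarvis-speech-image"
RUN_CMD="docker run --gpus all --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 ${JARVIS_IMAGE}"
echo "${RUN_CMD}"
```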

Hi @benjamin.vollmers
Could you please share your updated config file so we can help better?

Thanks

Here is the config.sh, @SunilJB:

# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

# Enable or Disable Jarvis Services
service_enabled_asr=false
service_enabled_nlp=true
service_enabled_tts=false

# Specify one or more GPUs to use
# specifying more than one GPU is currently an experimental feature, and may result in undefined behaviours.
gpus_to_use="device=0"

# Specify the encryption key to use to deploy models
MODEL_DEPLOY_KEY="tlt_encode"

# Locations to use for storing models artifacts
#
# If an absolute path is specified, the data will be written to that location
# Otherwise, a docker volume will be used (default).
#
# jarvis_init.sh will create a jmir and models directory in the volume or
# path specified.
#
# JMIR ($jarvis_model_loc/jmir)
# Jarvis uses an intermediate representation (JMIR) for models
# that are ready to deploy but not yet fully optimized for deployment. Pretrained
# versions can be obtained from NGC (by specifying NGC models below) and will be
# downloaded to $jarvis_model_loc/jmir by jarvis_init.sh
#
# Custom models produced by NeMo or TLT and prepared using jarvis-build
# may also be copied manually to this location ($jarvis_model_loc/jmir).
#
# Models ($jarvis_model_loc/models)
# During the jarvis_init process, the JMIR files in $jarvis_model_loc/jmir
# are inspected and optimized for deployment. The optimized versions are
# stored in $jarvis_model_loc/models. The jarvis server exclusively uses these
# optimized versions.
jarvis_model_loc="jarvis-model-repo"

# The default JMIRs are downloaded from NGC by default in the above $jarvis_jmir_loc directory
# If you'd like to skip the download from NGC and use the existing JMIRs in the $jarvis_jmir_loc
# then set the below $use_existing_jmirs flag to true. You can also deploy your set of custom
# JMIRs by keeping them in the jarvis_jmir_loc dir and use this quickstart script with the
# below flag to deploy them all together.
use_existing_jmirs=false

# Ports to expose for Jarvis services
jarvis_speech_api_port="50051"
jarvis_vision_api_port="60051"

# NGC orgs
jarvis_ngc_org="nvidia"
jarvis_ngc_team="jarvis"
jarvis_ngc_image_version="1.0.0-b.1"
jarvis_ngc_model_version="1.0.0-b.1"

# Pre-built models listed below will be downloaded from NGC. If models already exist in $jarvis-jmir
# then models can be commented out to skip download from NGC
models_asr=(
# Punctuation model
    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_punctuation:${jarvis_ngc_model_version}"

# Jasper Streaming w/ CPU decoder, best latency configuration
    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_jarvis_asr_jasper_english_streaming:${jarvis_ngc_model_version}"

# Jasper Streaming w/ CPU decoder, best throughput configuration
    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_jarvis_asr_jasper_english_streaming_throughput:${jarvis_ngc_model_version}"

# Jasper Offline w/ CPU decoder
    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_jarvis_asr_jasper_english_offline:${jarvis_ngc_model_version}"

Quarztnet Streaming w/ CPU decoder, best latency configuration

“{jarvis_ngc_org}/{jarvis_ngc_team}/jmir_jarvis_asr_quartznet_english_streaming:${jarvis_ngc_model_version}”

Quarztnet Streaming w/ CPU decoder, best throughput configuration

“{jarvis_ngc_org}/{jarvis_ngc_team}/jmir_jarvis_asr_quartznet_english_streaming_throughput:${jarvis_ngc_model_version}”

Quarztnet Offline w/ CPU decoder

“{jarvis_ngc_org}/{jarvis_ngc_team}/jmir_jarvis_asr_quartznet_english_offline:${jarvis_ngc_model_version}”

Jasper Streaming w/ GPU decoder, best latency configuration

“{jarvis_ngc_org}/{jarvis_ngc_team}/jmir_jarvis_asr_jasper_english_streaming_gpu_decoder:${jarvis_ngc_model_version}”

Jasper Streaming w/ GPU decoder, best throughput configuration

“{jarvis_ngc_org}/{jarvis_ngc_team}/jmir_jarvis_asr_jasper_english_streaming_throughput_gpu_decoder:${jarvis_ngc_model_version}”

Jasper Offline w/ GPU decoder

“{jarvis_ngc_org}/{jarvis_ngc_team}/jmir_jarvis_asr_jasper_english_offline_gpu_decoder:${jarvis_ngc_model_version}”

)

models_nlp=(
“{jarvis_ngc_org}/{jarvis_ngc_team}/jmir_punctuation:{jarvis_ngc_model_version}" # "{jarvis_ngc_org}/{jarvis_ngc_team}/jmir_named_entity_recognition:{jarvis_ngc_model_version}”
# “{jarvis_ngc_org}/{jarvis_ngc_team}/jmir_intent_slot:{jarvis_ngc_model_version}" # "{jarvis_ngc_org}/{jarvis_ngc_team}/jmir_question_answering:{jarvis_ngc_model_version}”
# “{jarvis_ngc_org}/{jarvis_ngc_team}/jmir_text_classification:${jarvis_ngc_model_version}”
)
models_tts=(

“{jarvis_ngc_org}/{jarvis_ngc_team}/jmir_jarvis_tts_ljspeech:${jarvis_ngc_model_version}”

)

NGC_TARGET={jarvis_ngc_org} if [[ ! -z {jarvis_ngc_team} ]]; then
NGC_TARGET="{NGC_TARGET}/{jarvis_ngc_team}"
else
team=""""
fi

define docker images required to run Jarvis

image_client=“nvcr.io/{NGC_TARGET}/jarvis-speech-client:{jarvis_ngc_image_version}”
image_speech_api=“nvcr.io/{NGC_TARGET}/jarvis-speech:{jarvis_ngc_image_version}-server”

define docker images required to setup Jarvis

image_init_speech=“nvcr.io/{NGC_TARGET}/jarvis-speech:{jarvis_ngc_image_version}-servicemaker”

daemon names

jarvis_daemon_speech=“jarvis-speech”
jarvis_daemon_client=“jarvis-client”

Thanks for starting this thread. I had the same issue with a similarly spec'd machine, but with the default config.

System:
Ubuntu 20.04
i7 6700k
32GB DDR4
RTX 4000


Hi @benjamin.vollmers
It seems the Jarvis model repository (stored in the Docker volume jarvis-model-repo) still contains all the models, so Triton tries to load them.
I would recommend removing the Docker volume with docker volume rm jarvis-model-repo and then rerunning jarvis_init.sh and jarvis_start.sh with the updated config.sh (with only the punctuation model enabled).
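For anyone following along, the recovery above can be sketched as follows, assuming you are in the quickstart directory that contains the scripts (including jarvis_stop.sh):

```shell
# Stop any running Jarvis containers first
bash jarvis_stop.sh

# Remove the stale model repository volume so Triton no longer
# tries to load the previously deployed ASR/TTS models
docker volume rm jarvis-model-repo

# Re-download and re-deploy only the models enabled in config.sh,
# then start the server again
bash jarvis_init.sh
bash jarvis_start.sh
```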

Thanks

@SunilJB Yes, that worked. Thanks.
Now I have some more questions:
I have to run jarvis_start_client.sh, and then I am working inside the client container. But how can I stop the client?
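For what it's worth, config.sh names the client container jarvis-client (the jarvis_daemon_client value), so one way to stop it is presumably via Docker from another terminal:

```shell
# Leave the interactive client shell with `exit` or Ctrl-D, or
# stop the container by its name from config.sh (jarvis_daemon_client)
docker stop jarvis-client
```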

Then there are two errors I get when trying the demo Python scripts from the documentation:

At first here is my current config.sh

# Copyright (c) 2021, NVIDIA CORPORATION.  All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto.  Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.

# Enable or Disable Jarvis Services
service_enabled_asr=false
service_enabled_nlp=true
service_enabled_tts=true

# Specify one or more GPUs to use
# specifying more than one GPU is currently an experimental feature, and may result in undefined behaviours.
gpus_to_use="device=0"

# Specify the encryption key to use to deploy models
MODEL_DEPLOY_KEY="tlt_encode"

# Locations to use for storing models artifacts
#
# If an absolute path is specified, the data will be written to that location
# Otherwise, a docker volume will be used (default).
#
# jarvis_init.sh will create a `jmir` and `models` directory in the volume or
# path specified. 
#
# JMIR ($jarvis_model_loc/jmir)
# Jarvis uses an intermediate representation (JMIR) for models
# that are ready to deploy but not yet fully optimized for deployment. Pretrained
# versions can be obtained from NGC (by specifying NGC models below) and will be
# downloaded to $jarvis_model_loc/jmir by `jarvis_init.sh`
# 
# Custom models produced by NeMo or TLT and prepared using jarvis-build
# may also be copied manually to this location $(jarvis_model_loc/jmir).
#
# Models ($jarvis_model_loc/models)
# During the jarvis_init process, the JMIR files in $jarvis_model_loc/jmir
# are inspected and optimized for deployment. The optimized versions are
# stored in $jarvis_model_loc/models. The jarvis server exclusively uses these
# optimized versions.
jarvis_model_loc="jarvis-model-repo"

# The default JMIRs are downloaded from NGC by default in the above $jarvis_jmir_loc directory
# If you'd like to skip the download from NGC and use the existing JMIRs in the $jarvis_jmir_loc
# then set the below $use_existing_jmirs flag to true. You can also deploy your set of custom
# JMIRs by keeping them in the jarvis_jmir_loc dir and use this quickstart script with the
# below flag to deploy them all together.
use_existing_jmirs=false

# Ports to expose for Jarvis services
jarvis_speech_api_port="50051"
jarvis_vision_api_port="60051"

# NGC orgs
jarvis_ngc_org="nvidia"
jarvis_ngc_team="jarvis"
jarvis_ngc_image_version="1.0.0-b.1"
jarvis_ngc_model_version="1.0.0-b.1"

# Pre-built models listed below will be downloaded from NGC. If models already exist in $jarvis-jmir
# then models can be commented out to skip download from NGC

models_asr=(
### Punctuation model
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_punctuation:${jarvis_ngc_model_version}"

### Jasper Streaming w/ CPU decoder, best latency configuration
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_jarvis_asr_jasper_english_streaming:${jarvis_ngc_model_version}"

### Jasper Streaming w/ CPU decoder, best throughput configuration
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_jarvis_asr_jasper_english_streaming_throughput:${jarvis_ngc_model_version}"

###  Jasper Offline w/ CPU decoder
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_jarvis_asr_jasper_english_offline:${jarvis_ngc_model_version}"
 
### Quarztnet Streaming w/ CPU decoder, best latency configuration
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_jarvis_asr_quartznet_english_streaming:${jarvis_ngc_model_version}"

### Quarztnet Streaming w/ CPU decoder, best throughput configuration
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_jarvis_asr_quartznet_english_streaming_throughput:${jarvis_ngc_model_version}"

### Quarztnet Offline w/ CPU decoder
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_jarvis_asr_quartznet_english_offline:${jarvis_ngc_model_version}"

### Jasper Streaming w/ GPU decoder, best latency configuration
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_jarvis_asr_jasper_english_streaming_gpu_decoder:${jarvis_ngc_model_version}"

### Jasper Streaming w/ GPU decoder, best throughput configuration
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_jarvis_asr_jasper_english_streaming_throughput_gpu_decoder:${jarvis_ngc_model_version}"

### Jasper Offline w/ GPU decoder
#    "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_jarvis_asr_jasper_english_offline_gpu_decoder:${jarvis_ngc_model_version}"
)

models_nlp=(
    #"${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_punctuation:${jarvis_ngc_model_version}"
     "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_named_entity_recognition:${jarvis_ngc_model_version}"
     "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_intent_slot:${jarvis_ngc_model_version}"
     "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_question_answering:${jarvis_ngc_model_version}"
     "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_text_classification:${jarvis_ngc_model_version}"
)
models_tts=(
   "${jarvis_ngc_org}/${jarvis_ngc_team}/jmir_jarvis_tts_ljspeech:${jarvis_ngc_model_version}"
)

NGC_TARGET=${jarvis_ngc_org}
if [[ ! -z ${jarvis_ngc_team} ]]; then
  NGC_TARGET="${NGC_TARGET}/${jarvis_ngc_team}"
else
  team="\"\""
fi

# define docker images required to run Jarvis
image_client="nvcr.io/${NGC_TARGET}/jarvis-speech-client:${jarvis_ngc_image_version}"
image_speech_api="nvcr.io/${NGC_TARGET}/jarvis-speech:${jarvis_ngc_image_version}-server"

# define docker images required to setup Jarvis
image_init_speech="nvcr.io/${NGC_TARGET}/jarvis-speech:${jarvis_ngc_image_version}-servicemaker"

# daemon names
jarvis_daemon_speech="jarvis-speech"
jarvis_daemon_client="jarvis-client"

And here is the part of the demo scripts I use for TTS:

import io
import librosa
from time import time
import numpy as np
import IPython.display as ipd
import grpc
import requests

# NLP proto
import jarvis_api.jarvis_nlp_core_pb2 as jcnlp
import jarvis_api.jarvis_nlp_core_pb2_grpc as jcnlp_srv
import jarvis_api.jarvis_nlp_pb2 as jnlp
import jarvis_api.jarvis_nlp_pb2_grpc as jnlp_srv

# TTS proto
import jarvis_api.jarvis_tts_pb2 as jtts
import jarvis_api.jarvis_tts_pb2_grpc as jtts_srv
import jarvis_api.audio_pb2 as ja

channel = grpc.insecure_channel('localhost:50051')


jarvis_nlp = jnlp_srv.JarvisNLPStub(channel)
jarvis_cnlp = jcnlp_srv.JarvisCoreNLPStub(channel)
jarvis_tts = jtts_srv.JarvisTTSStub(channel)
req = jtts.SynthesizeSpeechRequest()

req.text = "Is it recognize speech or wreck a nice beach?"
req.language_code = "en-US"                    # currently required to be "en-US"
req.encoding = ja.AudioEncoding.LINEAR_PCM     # Supports LINEAR_PCM, FLAC, MULAW and ALAW audio encodings
req.sample_rate_hz = 22050                     # ignored, audio returned will be 22.05KHz
req.voice_name = "ljspeech"                    # ignored

resp = jarvis_tts.Synthesize(req)
audio_samples = np.frombuffer(resp.audio, dtype=np.float32)
ipd.Audio(audio_samples, rate=22050)
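If IPython is not available, the decoded float32 buffer can also be written to a WAV file with the standard library. This is a minimal sketch of that step; since it has to run standalone here, a synthetic sine wave stands in for the real resp.audio bytes:

```python
import wave
import numpy as np

# Stand-in for resp.audio: one second of a 440 Hz sine encoded as
# raw 32-bit float PCM at 22050 Hz, like the Jarvis TTS response
t = np.linspace(0, 1, 22050, endpoint=False)
raw = (0.5 * np.sin(2 * np.pi * 440 * t)).astype(np.float32).tobytes()

# Same decoding step as in the script above
audio_samples = np.frombuffer(raw, dtype=np.float32)

# The wave module expects integer PCM, so scale float32 [-1, 1] to int16
pcm16 = (audio_samples * 32767).astype(np.int16)

with wave.open("output.wav", "wb") as f:
    f.setnchannels(1)      # mono
    f.setsampwidth(2)      # 16-bit samples
    f.setframerate(22050)  # sample rate returned by Jarvis TTS
    f.writeframes(pcm16.tobytes())
```

With the real response, you would replace `raw` with `resp.audio` and keep the rest unchanged.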

When executing it I get following error:

_InactiveRpcError                         Traceback (most recent call last)
<ipython-input-4-dcdb366acd73> in <module>
     32 req.voice_name = "ljspeech"                    # ignored
     33 
---> 34 resp = jarvis_tts.Synthesize(req)
     35 #audio_samples = np.frombuffer(resp.audio, dtype=np.float32)
     36 #ipd.Audio(audio_samples, rate=22050)

/usr/local/lib/python3.6/dist-packages/grpc/_channel.py in __call__(self, request, timeout, metadata, credentials, wait_for_ready, compression)
    921         state, call, = self._blocking(request, timeout, metadata, credentials,
    922                                       wait_for_ready, compression)
--> 923         return _end_unary_response_blocking(state, call, False, None)
    924 
    925     def with_call(self,

/usr/local/lib/python3.6/dist-packages/grpc/_channel.py in _end_unary_response_blocking(state, call, with_call, deadline)
    824             return state.response
    825     else:
--> 826         raise _InactiveRpcError(state)
    827 
    828 

_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.UNKNOWN
	details = "Error: TRTIS model failed during inference."
	debug_error_string = "{"created":"@1620990097.534903562","description":"Error received from peer ipv4:127.0.0.1:50051","file":"src/core/lib/surface/call.cc","file_line":1067,"grpc_message":"Error: TRTIS model failed during inference.","grpc_status":2}"