Hi all, I am having an issue installing Jarvis v1.2.0-beta on Ubuntu 20.04.2 with the Jarvis Quickstart.
After running sudo bash jarvis_init.sh, it says “Jarvis initialisation complete”. However, when I run sudo bash jarvis_start.sh, the Jarvis server cannot load all the models. Looking at the Docker logs, I think it might have something to do with failing to load jarvis-trt-waveglow.
I have made no changes to the config.sh file.
Here is the output of docker logs jarvis-speech:
==========================
== Jarvis Speech Skills ==
==========================
NVIDIA Release 21.05 (build 23684531)
Copyright (c) 2018-2021, NVIDIA CORPORATION. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying
project or file.
NOTE: The SHMEM allocation limit is set to the default of 64MB. This may be
insufficient for the inference server. NVIDIA recommends the use of the following flags:
nvidia-docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 ...
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:10.744186 75 metrics.cc:228] Collecting metrics for GPU 0: NVIDIA Quadro RTX 3000 with Max-Q Design
I0611 02:29:10.916648 75 pinned_memory_manager.cc:206] Pinned memory pool is created at '0x7f1c4e000000' with size 268435456
I0611 02:29:10.917305 75 cuda_memory_manager.cc:103] CUDA memory pool is created on device 0 with size 1000000000
E0611 02:29:10.926068 75 model_repository_manager.cc:1946] Poll failed for model directory 'jarvis-trt-jarvis_text_classification_domain-nn-bert-base-uncased': failed to open text file for read /data/models/jarvis-trt-jarvis_text_classification_domain-nn-bert-base-uncased/config.pbtxt: No such file or directory
I0611 02:29:10.930549 75 model_repository_manager.cc:1066] loading: citrinet-1024-asr-trt-ensemble-vad-streaming-feature-extractor-streaming:1
I0611 02:29:11.031248 75 model_repository_manager.cc:1066] loading: citrinet-1024-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming:1
I0611 02:29:11.032188 75 custom_backend.cc:201] Creating instance citrinet-1024-asr-trt-ensemble-vad-streaming-feature-extractor-streaming_0_0_gpu0 on GPU 0 (7.5) using libtriton_jarvis_asr_features.so
I0611 02:29:11.131601 75 model_repository_manager.cc:1066] loading: citrinet-1024-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming:1
I0611 02:29:11.131897 75 custom_backend.cc:198] Creating instance citrinet-1024-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming_0_0_cpu on CPU using libtriton_jarvis_asr_decoder_cpu.so
I0611 02:29:11.231859 75 model_repository_manager.cc:1066] loading: jarvis-trt-citrinet-1024:1
I0611 02:29:11.232112 75 custom_backend.cc:198] Creating instance citrinet-1024-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming_0_0_cpu on CPU using libtriton_jarvis_asr_vad.so
I0611 02:29:11.283928 75 model_repository_manager.cc:1240] successfully loaded 'citrinet-1024-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming' version 1
I0611 02:29:11.332063 75 model_repository_manager.cc:1066] loading: jarvis-trt-tacotron2_encoder:1
I0611 02:29:11.432315 75 model_repository_manager.cc:1066] loading: jarvis-trt-waveglow:1
I0611 02:29:11.532585 75 model_repository_manager.cc:1066] loading: jarvis_tokenizer:1
I0611 02:29:11.632892 75 model_repository_manager.cc:1066] loading: tacotron2_decoder_postnet:1
I0611 02:29:11.633249 75 custom_backend.cc:198] Creating instance jarvis_tokenizer_0_0_cpu on CPU using libtriton_jarvis_nlp_tokenizer.so
I0611 02:29:11.696349 75 model_repository_manager.cc:1240] successfully loaded 'jarvis_tokenizer' version 1
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:11.733491 75 model_repository_manager.cc:1066] loading: tts_preprocessor:1
I0611 02:29:11.736443 75 tacotron-decoder-postnet.cc:873] TRITONBACKEND_ModelInitialize: tacotron2_decoder_postnet (version 1)
I0611 02:29:11.739230 75 tacotron-decoder-postnet.cc:767] model configuration:
{
"name": "tacotron2_decoder_postnet",
"platform": "",
"backend": "jarvis_tts_taco_postnet",
"version_policy": {
"latest": {
"num_versions": 1
}
},
"max_batch_size": 8,
"input": [
{
"name": "input_decoder",
"data_type": "TYPE_FP32",
"format": "FORMAT_NONE",
"dims": [
1,
400,
512
],
"is_shape_tensor": false,
"allow_ragged_batch": false
},
{
"name": "input_processed_decoder",
"data_type": "TYPE_FP32",
"format": "FORMAT_NONE",
"dims": [
400,
128,
1,
1
],
"is_shape_tensor": false,
"allow_ragged_batch": false
},
{
"name": "input_num_characters",
"data_type": "TYPE_INT32",
"format": "FORMAT_NONE",
"dims": [
1
],
"is_shape_tensor": false,
"allow_ragged_batch": false
}
],
"output": [
{
"name": "spectrogram_chunk",
"data_type": "TYPE_FP32",
"dims": [
1,
80,
80
],
"label_filename": "",
"is_shape_tensor": false
},
{
"name": "z",
"data_type": "TYPE_FP32",
"dims": [
8,
2656,
1
],
"label_filename": "",
"is_shape_tensor": false
},
{
"name": "num_valid_samples",
"data_type": "TYPE_INT32",
"dims": [
1
],
"label_filename": "",
"is_shape_tensor": false
},
{
"name": "end_flag",
"data_type": "TYPE_INT32",
"dims": [
1
],
"label_filename": "",
"is_shape_tensor": false
}
],
"batch_input": [],
"batch_output": [],
"optimization": {
"priority": "PRIORITY_DEFAULT",
"input_pinned_memory": {
"enable": true
},
"output_pinned_memory": {
"enable": true
},
"gather_kernel_buffer_threshold": 0,
"eager_batching": false
},
"sequence_batching": {
"oldest": {
"max_candidate_sequences": 8,
"preferred_batch_size": [
8
],
"max_queue_delay_microseconds": 100
},
"max_sequence_idle_microseconds": 60000000,
"control_input": [
{
"name": "START",
"control": [
{
"kind": "CONTROL_SEQUENCE_START",
"int32_false_true": [
0,
1
],
"fp32_false_true": [],
"data_type": "TYPE_INVALID"
}
]
},
{
"name": "READY",
"control": [
{
"kind": "CONTROL_SEQUENCE_READY",
"int32_false_true": [
0,
1
],
"fp32_false_true": [],
"data_type": "TYPE_INVALID"
}
]
},
{
"name": "END",
"control": [
{
"kind": "CONTROL_SEQUENCE_END",
"int32_false_true": [
0,
1
],
"fp32_false_true": [],
"data_type": "TYPE_INVALID"
}
]
},
{
"name": "CORRID",
"control": [
{
"kind": "CONTROL_SEQUENCE_CORRID",
"int32_false_true": [],
"fp32_false_true": [],
"data_type": "TYPE_UINT64"
}
]
}
]
},
"instance_group": [
{
"name": "tacotron2_decoder_postnet_0",
"kind": "KIND_GPU",
"count": 1,
"gpus": [
0
],
"profile": []
}
],
"default_model_filename": "",
"cc_model_filenames": {},
"metric_tags": {},
"parameters": {
"num_samples_per_frame": {
"string_value": "256"
},
"z_dim0": {
"string_value": "8"
},
"tacotron_decoder_engine": {
"string_value": "/data/models/tacotron2_decoder_postnet/1/model.plan"
},
"num_mels": {
"string_value": "80"
},
"encoding_dimension": {
"string_value": "512"
},
"z_dim1": {
"string_value": "2656"
},
"max_execution_batch_size": {
"string_value": "8"
},
"chunk_length": {
"string_value": "80"
},
"max_input_length": {
"string_value": "400"
},
"attention_dimension": {
"string_value": "128"
}
},
"model_warmup": [],
"model_transaction_policy": {
"decoupled": true
}
}
I0611 02:29:11.739447 75 tacotron-decoder-postnet.cc:927] TRITONBACKEND_ModelInstanceInitialize: tacotron2_decoder_postnet_0 (device 0)
I0611 02:29:11.833734 75 model_repository_manager.cc:1066] loading: waveglow_denoiser:1
I0611 02:29:11.834481 75 custom_backend.cc:201] Creating instance tts_preprocessor_0_0_gpu0 on GPU 0 (7.5) using libtriton_jarvis_tts_preprocessor.so
I0611 02:29:11.841373 75 model_repository_manager.cc:1240] successfully loaded 'tts_preprocessor' version 1
I0611 02:29:11.934457 75 custom_backend.cc:201] Creating instance waveglow_denoiser_0_0_gpu0 on GPU 0 (7.5) using libtriton_jarvis_tts_denoiser.so
I0611 02:29:12.457702 75 model_repository_manager.cc:1240] successfully loaded 'citrinet-1024-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming' version 1
> Jarvis waiting for Triton server to load all models...retrying in 1 second
W0611 02:29:12.745213 75 metrics.cc:292] failed to get power limit for GPU 0: Not Supported
> Jarvis waiting for Triton server to load all models...retrying in 1 second
> Jarvis waiting for Triton server to load all models...retrying in 1 second
W0611 02:29:14.747740 75 metrics.cc:292] failed to get power limit for GPU 0: Not Supported
> Jarvis waiting for Triton server to load all models...retrying in 1 second
> Jarvis waiting for Triton server to load all models...retrying in 1 second
W0611 02:29:16.750214 75 metrics.cc:292] failed to get power limit for GPU 0: Not Supported
> Jarvis waiting for Triton server to load all models...retrying in 1 second
> Jarvis waiting for Triton server to load all models...retrying in 1 second
> Jarvis waiting for Triton server to load all models...retrying in 1 second
> Jarvis waiting for Triton server to load all models...retrying in 1 second
> Jarvis waiting for Triton server to load all models...retrying in 1 second
> Jarvis waiting for Triton server to load all models...retrying in 1 second
> Jarvis waiting for Triton server to load all models...retrying in 1 second
> Jarvis waiting for Triton server to load all models...retrying in 1 second
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:26.085057 75 model_repository_manager.cc:1240] successfully loaded 'citrinet-1024-asr-trt-ensemble-vad-streaming-feature-extractor-streaming' version 1
I0611 02:29:26.264476 75 plan_backend.cc:384] Creating instance jarvis-trt-tacotron2_encoder_0_0_gpu0 on GPU 0 (7.5) using model.plan
I0611 02:29:26.371046 75 model_repository_manager.cc:1240] successfully loaded 'waveglow_denoiser' version 1
I0611 02:29:26.372714 75 plan_backend.cc:772] Created instance jarvis-trt-tacotron2_encoder_0_0_gpu0 on GPU 0 with stream priority 0
I0611 02:29:26.378955 75 model_repository_manager.cc:1240] successfully loaded 'jarvis-trt-tacotron2_encoder' version 1
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:27.796439 75 plan_backend.cc:384] Creating instance jarvis-trt-citrinet-1024_0_0_gpu0 on GPU 0 (7.5) using model.plan
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:28.539207 75 model_repository_manager.cc:1240] successfully loaded 'tacotron2_decoder_postnet' version 1
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:29.211907 75 plan_backend.cc:768] Created instance jarvis-trt-citrinet-1024_0_0_gpu0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0611 02:29:29.222198 75 model_repository_manager.cc:1240] successfully loaded 'jarvis-trt-citrinet-1024' version 1
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:30.183957 75 plan_backend.cc:384] Creating instance jarvis-trt-waveglow_0_0_gpu0 on GPU 0 (7.5) using model.plan
> Jarvis waiting for Triton server to load all models...retrying in 1 second
E0611 02:29:31.128529 75 logging.cc:43] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
E0611 02:29:31.132046 75 logging.cc:43] FAILED_ALLOCATION: std::exception
E0611 02:29:31.174161 75 model_repository_manager.cc:1243] failed to load 'jarvis-trt-waveglow' version 1: Internal: unable to create TensorRT context
E0611 02:29:31.174547 75 model_repository_manager.cc:1431] Invalid argument: ensemble 'tacotron2_ensemble' depends on 'jarvis-trt-waveglow' which has no loaded version
I0611 02:29:31.174631 75 model_repository_manager.cc:1066] loading: citrinet-1024-asr-trt-ensemble-vad-streaming:1
I0611 02:29:31.275322 75 model_repository_manager.cc:1240] successfully loaded 'citrinet-1024-asr-trt-ensemble-vad-streaming' version 1
I0611 02:29:31.275466 75 server.cc:504]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+
I0611 02:29:31.275542 75 server.cc:543]
+-------------------------+-----------------------------------------------------------------------------------------+--------+
| Backend | Path | Config |
+-------------------------+-----------------------------------------------------------------------------------------+--------+
| tensorrt | <built-in> | {} |
| jarvis_tts_taco_postnet | /opt/tritonserver/backends/jarvis_tts_taco_postnet/libtriton_jarvis_tts_taco_postnet.so | {} |
+-------------------------+-----------------------------------------------------------------------------------------+--------+
I0611 02:29:31.275689 75 server.cc:586]
+------------------------------------------------------------------------------------+---------+----------------------------------------------------------+
| Model | Version | Status |
+------------------------------------------------------------------------------------+---------+----------------------------------------------------------+
| citrinet-1024-asr-trt-ensemble-vad-streaming | 1 | READY |
| citrinet-1024-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming | 1 | READY |
| citrinet-1024-asr-trt-ensemble-vad-streaming-feature-extractor-streaming | 1 | READY |
| citrinet-1024-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming | 1 | READY |
| jarvis-trt-citrinet-1024 | 1 | READY |
| jarvis-trt-tacotron2_encoder | 1 | READY |
| jarvis-trt-waveglow | 1 | UNAVAILABLE: Internal: unable to create TensorRT context |
| jarvis_tokenizer | 1 | READY |
| tacotron2_decoder_postnet | 1 | READY |
| tts_preprocessor | 1 | READY |
| waveglow_denoiser | 1 | READY |
+------------------------------------------------------------------------------------+---------+----------------------------------------------------------+
I0611 02:29:31.275851 75 tritonserver.cc:1658]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.9.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics |
| model_repository_path[0] | /data/models |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 1000000000 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
I0611 02:29:31.275864 75 server.cc:234] Waiting for in-flight requests to complete.
I0611 02:29:31.275872 75 model_repository_manager.cc:1099] unloading: tacotron2_decoder_postnet:1
I0611 02:29:31.275936 75 model_repository_manager.cc:1099] unloading: tts_preprocessor:1
I0611 02:29:31.276052 75 model_repository_manager.cc:1099] unloading: jarvis_tokenizer:1
I0611 02:29:31.276408 75 model_repository_manager.cc:1099] unloading: waveglow_denoiser:1
I0611 02:29:31.276463 75 tacotron-decoder-postnet.cc:1000] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0611 02:29:31.276557 75 model_repository_manager.cc:1099] unloading: jarvis-trt-tacotron2_encoder:1
I0611 02:29:31.276673 75 model_repository_manager.cc:1099] unloading: citrinet-1024-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming:1
I0611 02:29:31.276883 75 model_repository_manager.cc:1099] unloading: jarvis-trt-citrinet-1024:1
I0611 02:29:31.276999 75 model_repository_manager.cc:1099] unloading: citrinet-1024-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming:1
I0611 02:29:31.277074 75 model_repository_manager.cc:1223] successfully unloaded 'tts_preprocessor' version 1
I0611 02:29:31.277188 75 model_repository_manager.cc:1099] unloading: citrinet-1024-asr-trt-ensemble-vad-streaming-feature-extractor-streaming:1
I0611 02:29:31.277348 75 model_repository_manager.cc:1099] unloading: citrinet-1024-asr-trt-ensemble-vad-streaming:1
I0611 02:29:31.277534 75 server.cc:249] Timeout 30: Found 9 live models and 0 in-flight non-inference requests
I0611 02:29:31.277850 75 model_repository_manager.cc:1223] successfully unloaded 'citrinet-1024-asr-trt-ensemble-vad-streaming' version 1
I0611 02:29:31.279871 75 model_repository_manager.cc:1223] successfully unloaded 'jarvis_tokenizer' version 1
I0611 02:29:31.282404 75 model_repository_manager.cc:1223] successfully unloaded 'citrinet-1024-asr-trt-ensemble-vad-streaming-feature-extractor-streaming' version 1
I0611 02:29:31.283409 75 model_repository_manager.cc:1223] successfully unloaded 'jarvis-trt-tacotron2_encoder' version 1
I0611 02:29:31.284242 75 model_repository_manager.cc:1223] successfully unloaded 'waveglow_denoiser' version 1
I0611 02:29:31.288785 75 model_repository_manager.cc:1223] successfully unloaded 'citrinet-1024-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming' version 1
I0611 02:29:31.291100 75 model_repository_manager.cc:1223] successfully unloaded 'jarvis-trt-citrinet-1024' version 1
I0611 02:29:31.478491 75 model_repository_manager.cc:1223] successfully unloaded 'citrinet-1024-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming' version 1
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:32.277635 75 server.cc:249] Timeout 29: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:33.277750 75 server.cc:249] Timeout 28: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:34.277892 75 server.cc:249] Timeout 27: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:35.278036 75 server.cc:249] Timeout 26: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:36.278188 75 server.cc:249] Timeout 25: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:37.278330 75 server.cc:249] Timeout 24: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:38.278703 75 server.cc:249] Timeout 23: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:39.279092 75 server.cc:249] Timeout 22: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:40.279229 75 server.cc:249] Timeout 21: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:41.279357 75 server.cc:249] Timeout 20: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:42.279600 75 server.cc:249] Timeout 19: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:43.279781 75 server.cc:249] Timeout 18: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:44.279968 75 server.cc:249] Timeout 17: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:45.280209 75 server.cc:249] Timeout 16: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:46.280416 75 server.cc:249] Timeout 15: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:47.280598 75 server.cc:249] Timeout 14: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:48.280764 75 server.cc:249] Timeout 13: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:49.280947 75 server.cc:249] Timeout 12: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:50.281162 75 server.cc:249] Timeout 11: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:51.281379 75 server.cc:249] Timeout 10: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:52.281784 75 server.cc:249] Timeout 9: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:53.281963 75 server.cc:249] Timeout 8: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:54.282127 75 server.cc:249] Timeout 7: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:55.282304 75 server.cc:249] Timeout 6: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:56.282544 75 server.cc:249] Timeout 5: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:57.282767 75 server.cc:249] Timeout 4: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:58.282976 75 server.cc:249] Timeout 3: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:29:59.283205 75 server.cc:249] Timeout 2: Found 1 live models and 0 in-flight non-inference requests
> Jarvis waiting for Triton server to load all models...retrying in 1 second
I0611 02:30:00.283356 75 server.cc:249] Timeout 1: Found 1 live models and 0 in-flight non-inference requests
I0611 02:30:01.283508 75 server.cc:249] Timeout 0: Found 1 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
> Jarvis waiting for Triton server to load all models...retrying in 1 second
> Jarvis waiting for Triton server to load all models...retrying in 1 second
> Triton server died before reaching ready state. Terminating Jarvis startup.
Check Triton logs with: docker logs
/opt/jarvis/bin/start-jarvis: line 1: kill: (75) - No such process
Output of sudo bash jarvis_init.sh:
Please enter API key for ngc.nvidia.com:
Logging into NGC docker registry if necessary...
Pulling required docker images if necessary...
Note: This may take some time, depending on the speed of your Internet connection.
> Pulling Jarvis Speech Server images.
> Image nvcr.io/nvidia/jarvis/jarvis-speech:1.2.0-beta-server exists. Skipping.
> Image nvcr.io/nvidia/jarvis/jarvis-speech-client:1.2.0-beta exists. Skipping.
> Image nvcr.io/nvidia/jarvis/jarvis-speech:1.2.0-beta-servicemaker exists. Skipping.
Downloading models (JMIRs) from NGC...
Note: this may take some time, depending on the speed of your Internet connection.
To skip this process and use existing JMIRs set the location and corresponding flag in config.sh.
==========================
== Jarvis Speech Skills ==
==========================
NVIDIA Release devel (build 22382700)
Copyright (c) 2018-2021, NVIDIA CORPORATION. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying
project or file.
NOTE: The SHMEM allocation limit is set to the default of 64MB. This may be
insufficient for the inference server. NVIDIA recommends the use of the following flags:
nvidia-docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 ...
/data/artifacts /opt/jarvis
Directory jmir_punctuation_v1.2.0-beta already exists, skipping. Use '--force' option to override.
Directory jmir_jarvis_asr_citrinet_1024_asrset1p7_streaming_v1.2.0-beta already exists, skipping. Use '--force' option to override.
Directory jmir_jarvis_asr_citrinet_1024_asrset1p7_offline_v1.2.0-beta already exists, skipping. Use '--force' option to override.
Directory jmir_punctuation_v1.2.0-beta already exists, skipping. Use '--force' option to override.
Directory jmir_named_entity_recognition_v1.2.0-beta already exists, skipping. Use '--force' option to override.
Directory jmir_intent_slot_v1.2.0-beta already exists, skipping. Use '--force' option to override.
Directory jmir_question_answering_v1.2.0-beta already exists, skipping. Use '--force' option to override.
Directory jmir_text_classification_v1.2.0-beta already exists, skipping. Use '--force' option to override.
Directory jmir_jarvis_tts_ljspeech_v1.2.0-beta already exists, skipping. Use '--force' option to override.
/opt/jarvis
Converting JMIRs at jarvis-model-repo/jmir to Jarvis Model repository.
+ docker run --init -it --rm --gpus '"device=0"' -v jarvis-model-repo:/data -e MODEL_DEPLOY_KEY=tlt_encode --name jarvis-service-maker nvcr.io/nvidia/jarvis/jarvis-speech:1.2.0-beta-servicemaker deploy_all_models /data/jmir /data/models
==========================
== Jarvis Speech Skills ==
==========================
NVIDIA Release devel (build 22382700)
Copyright (c) 2018-2021, NVIDIA CORPORATION. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying
project or file.
NOTE: The SHMEM allocation limit is set to the default of 64MB. This may be
insufficient for the inference server. NVIDIA recommends the use of the following flags:
nvidia-docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 ...
Traceback (most recent call last):
File "/opt/conda/bin/jarvis-deploy", line 8, in <module>
sys.exit(deploy_from_jmir())
File "/opt/conda/lib/python3.8/site-packages/servicemaker/cli/deploy.py", line 73, in deploy_from_jmir
raise FileExistsError(f"{args.target} exists. Use --force/-f to overwrite.")
FileExistsError: /data/models exists. Use --force/-f to overwrite.
+ echo
+ echo 'Jarvis initialization complete. Run ./jarvis_start.sh to launch services.'
Jarvis initialization complete. Run ./jarvis_start.sh to launch services.
Output of sudo bash jarvis_start.sh:
Starting Jarvis Speech Services. This may take several minutes depending on the number of models deployed.
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Waiting for Jarvis server to load all models...retrying in 10 seconds
Health ready check failed.
Check Jarvis logs with: docker logs jarvis-speech
What I have done to try to resolve the issue:
Following the forum thread below, I commented out all the NLP models except one and also commented out the TTS model, then re-ran sudo bash jarvis_init.sh and sudo bash jarvis_start.sh, but it still doesn’t work.
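For context, my edits were along these lines (a rough sketch; the exact variable and model names come from the Quickstart's config.sh, so this may not match your copy exactly):

```shell
# In config.sh from the Jarvis Quickstart:

# Disable the TTS service entirely (variable name as it appears in my
# copy of config.sh; verify against yours)
service_enabled_tts=false

# In the list of NLP models, keep only one entry and comment out the
# rest, e.g. keep punctuation and comment out NER, intent/slot, QA,
# and text classification.
```

After editing, jarvis_init.sh must be re-run so the model repository is regenerated.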
I also tried removing the Docker volume jarvis-model-repo, following the forum thread below. However, I could not remove the volume even with the force option (docker volume rm -f jarvis-model-repo); it outputs:
Error response from daemon: remove jarvis-model-repo: volume is in use - [8977df414a7b381d054433d1ea37861232d53812d7dfda2d6bfcfc7eb93fd436]
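As I understand it, this error means a container (the ID in brackets) still references the volume, and the container has to be removed first. The commands I believe should do this are sketched below (assuming nothing else on the machine needs that container):

```shell
# List all containers (including stopped ones) that still use the volume
docker ps -a --filter volume=jarvis-model-repo

# Force-remove the offending container by its ID, then remove the volume
docker rm -f <container-id>
docker volume rm jarvis-model-repo
```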
I’m not sure whether the information below is useful, but here it is.
Here is the output of nvidia-smi:
Fri Jun 11 11:50:51 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 465.19.01 Driver Version: 465.19.01 CUDA Version: 11.3 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA Quadro R... On | 00000000:01:00.0 Off | N/A |
| N/A 48C P5 6W / N/A | 948MiB / 5934MiB | 29% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1112 G /usr/lib/xorg/Xorg 159MiB |
| 0 N/A N/A 1841 G /usr/lib/xorg/Xorg 358MiB |
| 0 N/A N/A 2032 G /usr/bin/gnome-shell 92MiB |
| 0 N/A N/A 2557 G ...AAAAAAAAA= --shared-files 109MiB |
| 0 N/A N/A 13952 G ...AAAAAAAAA= --shared-files 29MiB |
| 0 N/A N/A 24910 G ...AAAAAAAAA= --shared-files 182MiB |
+-----------------------------------------------------------------------------+
Output of nvcc --version:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
My docker version:
Client: Docker Engine - Community
Version: 20.10.7
API version: 1.41
Go version: go1.13.15
Git commit: f0df350
Built: Wed Jun 2 11:56:38 2021
OS/Arch: linux/amd64
Context: default
Experimental: true
Server: Docker Engine - Community
Engine:
Version: 20.10.7
API version: 1.41 (minimum version 1.12)
Go version: go1.13.15
Git commit: b0f5bc3
Built: Wed Jun 2 11:54:50 2021
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.4.6
GitCommit: d71fcd7d8303cbf684402823e425e9dd2e99285d
runc:
Version: 1.0.0-rc95
GitCommit: b9ee9c6314599f1b4a7f497e1f1f856fe433d3b7
docker-init:
Version: 0.19.0
GitCommit: de40ad0
Output of sudo apt install nvidia-cuda-toolkit:
Reading package lists... Done
Building dependency tree
Reading state information... Done
nvidia-cuda-toolkit is already the newest version (10.1.243-3).
The following packages were automatically installed and are no longer required:
chromium-codecs-ffmpeg-extra gstreamer1.0-vaapi libgstreamer-plugins-bad1.0-0 libva-wayland2
Use 'sudo apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 211 not upgraded.
Output of uname -a:
Linux ato-Precision-5750 5.10.0-1029-oem #30-Ubuntu SMP Fri May 28 23:53:50 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Output of lscpu:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 39 bits physical, 48 bits virtual
CPU(s): 16
On-line CPU(s) list: 0-15
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 165
Model name: Intel(R) Core(TM) i7-10875H CPU @ 2.30GHz
Stepping: 2
CPU MHz: 858.548
CPU max MHz: 5100.0000
CPU min MHz: 800.0000
BogoMIPS: 4599.93
Virtualization: VT-x
L1d cache: 256 KiB
L1i cache: 256 KiB
L2 cache: 2 MiB
L3 cache: 16 MiB
NUMA node0 CPU(s): 0-15
Vulnerability Itlb multihit: KVM: Mitigation: VMX disabled
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Enhanced IBRS, IBPB conditional, RSB filling
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pd
pe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 moni
tor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c
rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept
vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsav
es dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp pku ospke md_clear flush_l1d arch_capabilities
Output of lspci | grep VGA:
00:02.0 VGA compatible controller: Intel Corporation UHD Graphics (rev 05)
01:00.0 VGA compatible controller: NVIDIA Corporation TU106GLM [Quadro RTX 3000 Mobile / Max-Q] (rev a1)
I am dual-booting Ubuntu 20.04.2 alongside Windows 10.
Thank you very much!