Hello. I am trying to set up Riva, but I get an error. Do you know what is causing it and how I can solve the problem?
Hardware - GPU (GEFORCE RTX 3060)
Hardware - CPU core i7
Operating System - Ubuntu 20.04
Riva Version - v1.4.0-beta
- $ bash riva_init.sh - done.
- $ bash riva_start.sh - an error occurred and it timed out.
$ bash riva_start.sh
Starting Riva Speech Services. This may take several minutes depending on the number of models deployed.
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Waiting for Riva server to load all models…retrying in 10 seconds
Health ready check failed.
Check Riva logs with: docker logs riva-speech
nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   37C    P0    14W /  N/A |     10MiB /  5946MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1172      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A      1992      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+
docker logs riva-speech
==========================
=== Riva Speech Skills ===

NVIDIA Release 21.07 (build 25292380)

Copyright (c) 2018-2021, NVIDIA CORPORATION. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying
project or file.

NOTE: The SHMEM allocation limit is set to the default of 64MB. This may be
insufficient for the inference server. NVIDIA recommends the use of the following flags:
nvidia-docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 …

Riva waiting for Triton server to load all models…retrying in 1 second
I0827 04:44:08.833089 70 metrics.cc:228] Collecting metrics for GPU 0: NVIDIA GeForce RTX 3060 Laptop GPU
I0827 04:44:08.881955 70 onnxruntime.cc:1722] TRITONBACKEND_Initialize: onnxruntime
I0827 04:44:08.882453 70 onnxruntime.cc:1732] Triton TRITONBACKEND API version: 1.0
I0827 04:44:08.882457 70 onnxruntime.cc:1738] ‘onnxruntime’ TRITONBACKEND API version: 1.0
I0827 04:44:09.060622 70 pinned_memory_manager.cc:206] Pinned memory pool is created at ‘0x7f6204000000’ with size 268435456
I0827 04:44:09.061990 70 cuda_memory_manager.cc:103] CUDA memory pool is created on device 0 with size 1000000000
E0827 04:44:09.082852 70 model_repository_manager.cc:1946] Poll failed for model directory ‘riva-trt-riva_punctuation-nn-bert-base-uncased’: failed to open text file for read /data/models/riva-trt-riva_punctuation-nn-bert-base-uncased/config.pbtxt: No such file or directory
I0827 04:44:09.084114 70 model_repository_manager.cc:1066] loading: citrinet-1024-asr-trt-ensemble-vad-streaming-feature-extractor-streaming:1
I0827 04:44:09.184510 70 model_repository_manager.cc:1066] loading: citrinet-1024-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming:1
I0827 04:44:09.184942 70 custom_backend.cc:201] Creating instance citrinet-1024-asr-trt-ensemble-vad-streaming-feature-extractor-streaming_0_0_gpu0 on GPU 0 (8.6) using libtriton_riva_asr_features.so
I0827 04:44:09.284805 70 model_repository_manager.cc:1066] loading: citrinet-1024-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline:1
I0827 04:44:09.285018 70 custom_backend.cc:198] Creating instance citrinet-1024-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming_0_0_cpu on CPU using libtriton_riva_asr_decoder_cpu.so
W:parameter_parser.cc:106: Parameter forerunner_start_offset_ms could not be set from parameters
W:parameter_parser.cc:107: Default value will be used
W:parameter_parser.cc:106: Parameter voc_string could not be set from parameters
W:parameter_parser.cc:107: Default value will be used
Riva waiting for Triton server to load all models…retrying in 1 second
I0827 04:44:09.385032 70 model_repository_manager.cc:1066] loading: citrinet-1024-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming:1
I0827 04:44:09.385332 70 custom_backend.cc:198] Creating instance citrinet-1024-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline_0_0_cpu on CPU using libtriton_riva_asr_vad.so
I0827 04:44:09.452584 70 model_repository_manager.cc:1240] successfully loaded ‘citrinet-1024-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline’ version 1
I0827 04:44:09.485357 70 model_repository_manager.cc:1066] loading: riva-trt-citrinet-1024:1
I0827 04:44:09.485657 70 custom_backend.cc:198] Creating instance citrinet-1024-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming_0_0_cpu on CPU using libtriton_riva_asr_vad.so
I0827 04:44:09.540997 70 model_repository_manager.cc:1240] successfully loaded ‘citrinet-1024-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming’ version 1
I0827 04:44:09.585618 70 model_repository_manager.cc:1066] loading: riva_tokenizer:1
I0827 04:44:09.685859 70 model_repository_manager.cc:1066] loading: citrinet-1024-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline:1
I0827 04:44:09.686130 70 custom_backend.cc:198] Creating instance riva_tokenizer_0_0_cpu on CPU using libtriton_riva_nlp_tokenizer.so
I0827 04:44:09.786237 70 model_repository_manager.cc:1066] loading: citrinet-1024-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline:1
I0827 04:44:09.786477 70 custom_backend.cc:198] Creating instance citrinet-1024-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline_0_0_cpu on CPU using libtriton_riva_asr_decoder_cpu.so
W:parameter_parser.cc:106: Parameter forerunner_start_offset_ms could not be set from parameters
W:parameter_parser.cc:107: Default value will be used
W:parameter_parser.cc:106: Parameter voc_string could not be set from parameters
W:parameter_parser.cc:107: Default value will be used
I0827 04:44:09.886860 70 custom_backend.cc:201] Creating instance citrinet-1024-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline_0_0_gpu0 on GPU 0 (8.6) using libtriton_riva_asr_features.so
I0827 04:44:09.910864 70 model_repository_manager.cc:1240] successfully loaded ‘riva_tokenizer’ version 1
Riva waiting for Triton server to load all models…retrying in 1 second
W0827 04:44:10.836003 70 metrics.cc:292] failed to get power limit for GPU 0: Not Supported
I0827 04:44:11.098119 70 model_repository_manager.cc:1240] successfully loaded ‘citrinet-1024-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming’ version 1
I0827 04:44:11.298671 70 model_repository_manager.cc:1240] successfully loaded ‘citrinet-1024-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline’ version 1
Riva waiting for Triton server to load all models…retrying in 1 second
Riva waiting for Triton server to load all models…retrying in 1 second
W0827 04:44:12.837705 70 metrics.cc:292] failed to get power limit for GPU 0: Not Supported
Riva waiting for Triton server to load all models…retrying in 1 second
Riva waiting for Triton server to load all models…retrying in 1 second
W0827 04:44:14.841930 70 metrics.cc:292] failed to get power limit for GPU 0: Not Supported
Riva waiting for Triton server to load all models…retrying in 1 second
Riva waiting for Triton server to load all models…retrying in 1 second
Riva waiting for Triton server to load all models…retrying in 1 second
Riva waiting for Triton server to load all models…retrying in 1 second
I0827 04:44:18.869097 70 model_repository_manager.cc:1240] successfully loaded ‘citrinet-1024-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline’ version 1
I0827 04:44:18.869124 70 model_repository_manager.cc:1240] successfully loaded ‘citrinet-1024-asr-trt-ensemble-vad-streaming-feature-extractor-streaming’ version 1
Riva waiting for Triton server to load all models…retrying in 1 second
Riva waiting for Triton server to load all models…retrying in 1 second
I0827 04:44:20.590049 70 plan_backend.cc:384] Creating instance riva-trt-citrinet-1024_0_0_gpu0 on GPU 0 (8.6) using model.plan
Riva waiting for Triton server to load all models…retrying in 1 second
I0827 04:44:21.623610 70 plan_backend.cc:768] Created instance riva-trt-citrinet-1024_0_0_gpu0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0827 04:44:21.630624 70 model_repository_manager.cc:1240] successfully loaded ‘riva-trt-citrinet-1024’ version 1
I0827 04:44:21.631136 70 model_repository_manager.cc:1066] loading: citrinet-1024-asr-trt-ensemble-vad-streaming:1
I0827 04:44:21.731656 70 model_repository_manager.cc:1066] loading: citrinet-1024-asr-trt-ensemble-vad-streaming-offline:1
I0827 04:44:21.832078 70 model_repository_manager.cc:1240] successfully loaded ‘citrinet-1024-asr-trt-ensemble-vad-streaming’ version 1
I0827 04:44:21.832416 70 model_repository_manager.cc:1240] successfully loaded ‘citrinet-1024-asr-trt-ensemble-vad-streaming-offline’ version 1
I0827 04:44:21.832599 70 server.cc:504]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0827 04:44:21.832682 70 server.cc:543]
+-------------+-----------------------------------------------------------------+--------+
| Backend     | Path                                                            | Config |
+-------------+-----------------------------------------------------------------+--------+
| tensorrt    |                                                                 | {}     |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {}     |
+-------------+-----------------------------------------------------------------+--------+

I0827 04:44:21.832843 70 server.cc:586]
+----------------------------------------------------------------------------------------------------+---------+--------+
| Model                                                                                              | Version | Status |
+----------------------------------------------------------------------------------------------------+---------+--------+
| citrinet-1024-asr-trt-ensemble-vad-streaming                                                       | 1       | READY  |
| citrinet-1024-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming                             | 1       | READY  |
| citrinet-1024-asr-trt-ensemble-vad-streaming-feature-extractor-streaming                           | 1       | READY  |
| citrinet-1024-asr-trt-ensemble-vad-streaming-offline                                               | 1       | READY  |
| citrinet-1024-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline             | 1       | READY  |
| citrinet-1024-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline           | 1       | READY  |
| citrinet-1024-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline | 1       | READY  |
| citrinet-1024-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming                 | 1       | READY  |
| riva-trt-citrinet-1024                                                                             | 1       | READY  |
| riva_tokenizer                                                                                     | 1       | READY  |
+----------------------------------------------------------------------------------------------------+---------+--------+

I0827 04:44:21.833068 70 tritonserver.cc:1658]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                                  |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                                                 |
| server_version                   | 2.9.0                                                                                                                                                                                  |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics |
| model_repository_path[0]         | /data/models                                                                                                                                                                           |
| model_control_mode               | MODE_NONE                                                                                                                                                                              |
| strict_model_config              | 1                                                                                                                                                                                      |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                                                              |
| cuda_memory_pool_byte_size{0}    | 1000000000                                                                                                                                                                             |
| min_supported_compute_capability | 6.0                                                                                                                                                                                    |
| strict_readiness                 | 1                                                                                                                                                                                      |
| exit_timeout                     | 30                                                                                                                                                                                     |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0827 04:44:21.833095 70 server.cc:234] Waiting for in-flight requests to complete.
I0827 04:44:21.833129 70 model_repository_manager.cc:1099] unloading: riva_tokenizer:1
I0827 04:44:21.833216 70 model_repository_manager.cc:1099] unloading: citrinet-1024-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming:1
I0827 04:44:21.833405 70 model_repository_manager.cc:1099] unloading: citrinet-1024-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline:1
I0827 04:44:21.833558 70 model_repository_manager.cc:1099] unloading: citrinet-1024-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline:1
I0827 04:44:21.833727 70 model_repository_manager.cc:1099] unloading: citrinet-1024-asr-trt-ensemble-vad-streaming-offline:1
I0827 04:44:21.834200 70 model_repository_manager.cc:1099] unloading: citrinet-1024-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming:1
I0827 04:44:21.834347 70 model_repository_manager.cc:1223] successfully unloaded ‘citrinet-1024-asr-trt-ensemble-vad-streaming-offline’ version 1
I0827 04:44:21.834649 70 model_repository_manager.cc:1099] unloading: riva-trt-citrinet-1024:1
I0827 04:44:21.834740 70 model_repository_manager.cc:1099] unloading: citrinet-1024-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline:1
I0827 04:44:21.835464 70 model_repository_manager.cc:1099] unloading: citrinet-1024-asr-trt-ensemble-vad-streaming-feature-extractor-streaming:1
I0827 04:44:21.835920 70 model_repository_manager.cc:1099] unloading: citrinet-1024-asr-trt-ensemble-vad-streaming:1
I0827 04:44:21.836156 70 server.cc:249] Timeout 30: Found 9 live models and 0 in-flight non-inference requests
I0827 04:44:21.836378 70 model_repository_manager.cc:1223] successfully unloaded ‘citrinet-1024-asr-trt-ensemble-vad-streaming’ version 1
I0827 04:44:21.839485 70 model_repository_manager.cc:1223] successfully unloaded ‘riva_tokenizer’ version 1
I0827 04:44:21.848595 70 model_repository_manager.cc:1223] successfully unloaded ‘citrinet-1024-asr-trt-ensemble-vad-streaming-voice-activity-detector-ctc-streaming’ version 1
I0827 04:44:21.848859 70 model_repository_manager.cc:1223] successfully unloaded ‘citrinet-1024-asr-trt-ensemble-vad-streaming-offline-voice-activity-detector-ctc-streaming-offline’ version 1
I0827 04:44:21.862569 70 model_repository_manager.cc:1223] successfully unloaded ‘citrinet-1024-asr-trt-ensemble-vad-streaming-feature-extractor-streaming’ version 1
I0827 04:44:21.879011 70 model_repository_manager.cc:1223] successfully unloaded ‘citrinet-1024-asr-trt-ensemble-vad-streaming-offline-feature-extractor-streaming-offline’ version 1
I0827 04:44:21.879243 70 model_repository_manager.cc:1223] successfully unloaded ‘riva-trt-citrinet-1024’ version 1
I0827 04:44:22.064240 70 model_repository_manager.cc:1223] successfully unloaded ‘citrinet-1024-asr-trt-ensemble-vad-streaming-offline-ctc-decoder-cpu-streaming-offline’ version 1
I0827 04:44:22.071154 70 model_repository_manager.cc:1223] successfully unloaded ‘citrinet-1024-asr-trt-ensemble-vad-streaming-ctc-decoder-cpu-streaming’ version 1
Riva waiting for Triton server to load all models…retrying in 1 second
I0827 04:44:22.836629 70 server.cc:249] Timeout 29: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
Riva waiting for Triton server to load all models…retrying in 1 second
Riva waiting for Triton server to load all models…retrying in 1 second
Triton server died before reaching ready state. Terminating Riva startup.
Check Triton logs with: docker logs
/opt/riva/bin/start-riva: line 1: kill: (70) - No such process
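Since the log is long, here is a minimal Python sketch for pulling out only the error-level lines. It assumes the Triton lines follow the prefix format seen above (a severity letter I/W/E/F, then MMDD, a timestamp, a process ID and `file.cc:line]`). The names `extract_errors`, `GLOG_LINE` and `SAMPLE_LOG` are my own and hypothetical, not part of Riva or Triton.

```python
import re

# Prefix format as it appears in the log above, e.g.
# "E0827 04:44:09.082852 70 model_repository_manager.cc:1946] message"
GLOG_LINE = re.compile(
    r"^(?P<level>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<pid>\d+) (?P<source>[^\]]+)\] ?(?P<message>.*)$"
)

# A few lines copied from the log above, used only for illustration.
SAMPLE_LOG = """\
I0827 04:44:09.061990 70 cuda_memory_manager.cc:103] CUDA memory pool is created on device 0 with size 1000000000
E0827 04:44:09.082852 70 model_repository_manager.cc:1946] Poll failed for model directory 'riva-trt-riva_punctuation-nn-bert-base-uncased': failed to open text file for read /data/models/riva-trt-riva_punctuation-nn-bert-base-uncased/config.pbtxt: No such file or directory
W0827 04:44:10.836003 70 metrics.cc:292] failed to get power limit for GPU 0: Not Supported
error: creating server: Internal - failed to load all models
"""


def extract_errors(log_text):
    """Return (source, message) pairs for E/F-level lines and bare 'error:' lines."""
    found = []
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        match = GLOG_LINE.match(line)
        if match and match.group("level") in ("E", "F"):
            found.append((match.group("source"), match.group("message")))
        elif line.startswith("error:"):
            # Lines such as "error: creating server: ..." carry no prefix.
            found.append(("", line[len("error:"):].strip()))
    return found


if __name__ == "__main__":
    for source, message in extract_errors(SAMPLE_LOG):
        print(source or "(no source)", "->", message)
```

To run it on the full output, the log can be saved first with `docker logs riva-speech > riva.log 2>&1` and the file contents passed to `extract_errors`. In the log above, the only E-level line is the missing `config.pbtxt` for `riva-trt-riva_punctuation-nn-bert-base-uncased`, followed later by `error: creating server: Internal - failed to load all models`.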
Thanks.