Hello,
Could someone help with this issue on the NVIDIA AGX Orin Developer Kit? It occurs after executing bash riva_start.sh:
I0930 19:25:34.537914 20 model_lifecycle.cc:578] successfully unloaded 'riva-onnx-fastpitch_encoder-English-US' version 1
I0930 19:25:34.537982 20 model_lifecycle.cc:578] successfully unloaded 'conformer-en-US-asr-streaming-feature-extractor-streaming' version 1
I0930 19:25:34.566979 20 model_lifecycle.cc:578] successfully unloaded 'tts_preprocessor-English-US' version 1
I0930 19:25:35.343146 20 ctc-decoder-library.cc:24] TRITONBACKEND_ModelFinalize: delete model state
I0930 19:25:35.345675 20 model_lifecycle.cc:578] successfully unloaded 'conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming' version 1
I0930 19:25:35.503639 20 server.cc:302] Timeout 29: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
Thanks!
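One quick check that may help: the "failed to load all models" line is only a summary, and the specific load error is logged earlier in the same output. A sketch for surfacing it, assuming the quickstart's default container name riva-speech (adjust if you changed it in config.sh):

```shell
# Filter the Riva server log for the first concrete load failure.
# "riva-speech" is the quickstart default container name (an assumption here).
docker logs riva-speech 2>&1 | grep -iE 'error|failed to load' | head -n 20
```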
The output after running bash riva_init.sh:
CLI_VERSION: Latest - 3.29.0 available (current: 3.26.0). Please update by using the command 'ngc version upgrade'
Getting files to download...
[progress bar elided] Total: 1 - Completed: 0 - Failed: 1
--------------------------------------------------------------------------------
Download status: FAILED
Downloaded local path model: /data/artifacts/models_asr_conformer_en_us_str_v2.13.0-tegra-orin
Total files downloaded: 0
Total transferred: 0 B
Started at: 2023-09-30 19:04:58
Completed at: 2023-09-30 19:10:56
Duration taken: 5m 57s
--------------------------------------------------------------------------------
Getting files to download...
[progress bar elided] Total: 1 - Completed: 0 - Failed: 1
--------------------------------------------------------------------------------
Download status: FAILED
Downloaded local path model: /data/artifacts/models_nlp_punctuation_bert_base_en_us_v2.13.0-tegra-orin
Total files downloaded: 0
Total transferred: 0 B
Started at: 2023-09-30 19:11:01
Completed at: 2023-09-30 19:17:27
Duration taken: 6m 26s
--------------------------------------------------------------------------------
Getting files to download...
[progress bar elided] Total: 1 - Completed: 1 - Failed: 0
--------------------------------------------------------------------------------
Download status: COMPLETED
Downloaded local path model: /data/artifacts/models_tts_fastpitch_hifigan_en_us_ipa_v2.13.0-tegra-orin
Total files downloaded: 1
Total transferred: 186.53 MB
Started at: 2023-09-30 19:17:32
Completed at: 2023-09-30 19:22:25
Duration taken: 4m 53s
--------------------------------------------------------------------------------
+ [[ tegra != \t\e\g\r\a ]]
+ [[ tegra == \t\e\g\r\a ]]
+ '[' -d /mnt/storage/Projects/resources/riva_quickstart_arm64_v2.13.0/model_repository/rmir ']'
+ [[ tegra == \t\e\g\r\a ]]
+ '[' -d /mnt/storage/Projects/resources/riva_quickstart_arm64_v2.13.0/model_repository/prebuilt ']'
+ echo 'Converting prebuilts at /mnt/storage/Projects/resources/riva_quickstart_arm64_v2.13.0/model_repository/prebuilt to Riva Model repository.'
Converting prebuilts at /mnt/storage/Projects/resources/riva_quickstart_arm64_v2.13.0/model_repository/prebuilt to Riva Model repository.
+ docker run -it -d --rm -v /mnt/storage/Projects/resources/riva_quickstart_arm64_v2.13.0/model_repository:/data --name riva-models-extract nvcr.io/nvidia/riva/riva-speech:2.13.0-l4t-aarch64
+ docker exec riva-models-extract bash -c 'mkdir -p /data/models; \
for file in /data/prebuilt/*.tar.gz; do tar xf $file -C /data/models/ &> /dev/null; done'
+ docker container stop riva-models-extract
+ '[' 0 -ne 0 ']'
+ echo
+ echo 'Riva initialization complete. Run ./riva_start.sh to launch services.'
Riva initialization complete. Run ./riva_start.sh to launch services.
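Note that two of the three downloads above failed (the ASR and punctuation models), yet the script still printed the completion message, so riva_start.sh later finds nothing to load for those services. Before re-running the init, it can help to verify which model directories actually contain files. A sketch, using the artifact paths printed in the log (these are container-side paths; on the host they live under the quickstart's model_repository volume):

```shell
# Check that each model artifact directory exists and is non-empty.
# Empty or missing directories correspond to the FAILED downloads above;
# re-run riva_init.sh (with a valid NGC API key in config.sh) to retry them.
for d in /data/artifacts/models_asr_conformer_en_us_str_v2.13.0-tegra-orin \
         /data/artifacts/models_nlp_punctuation_bert_base_en_us_v2.13.0-tegra-orin \
         /data/artifacts/models_tts_fastpitch_hifigan_en_us_ipa_v2.13.0-tegra-orin; do
    if [ -d "$d" ] && [ -n "$(ls -A "$d" 2>/dev/null)" ]; then
        echo "OK:      $d"
    else
        echo "MISSING: $d"
    fi
done
```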
I have the same problem while trying to run inference on a Mixtral model with vLLM.
I tried removing unnecessary files (a suggestion I found on other forum pages), but it didn't help.
Has anyone found a solution for this?
arghya
August 4, 2024, 1:02pm
@shahizat, I’m facing a similar issue. To get to the bottom of it, you might want to check the logs for riva-model-init in the Triton container, since that’s the first step that runs: it downloads all the relevant models and creates the binaries that riva-api needs for the inference service.
kubectl logs triton0-64594899c8-z4l5z -c riva-model-init
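If you are not sure which init containers the pod runs, you can list them first and then pull the tail of each one's log. A sketch, reusing the pod name from the command above:

```shell
POD=triton0-64594899c8-z4l5z   # your Triton pod name
# Enumerate the pod's init containers, then dump the last lines of each log.
for c in $(kubectl get pod "$POD" -o jsonpath='{.spec.initContainers[*].name}'); do
    echo "=== init container: $c ==="
    kubectl logs "$POD" -c "$c" --tail=50
done
```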
In my case the error is similar: the model-init container fails with
2024-08-04 13:02:32,656 [ERROR] Failed decryption. Please provide decryption key on the command line. model.rmir:ENCRYPTION_KEY
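The model.rmir:ENCRYPTION_KEY suffix in that message is the clue: the deploy step expects the RMIR path and its decryption key joined by a colon. NVIDIA-published Riva models are typically encrypted with the key tlt_encode; if you exported the RMIR yourself, use whatever key you passed at export time. A hedged sketch of the deploy invocation (the paths are illustrative, not from this thread):

```shell
# Deploy an encrypted RMIR into the Triton model repository.
# "tlt_encode" is the key commonly used for NVIDIA-published models;
# replace it with your own key for custom exports.
RMIR=/data/rmir/model.rmir   # illustrative path
KEY=tlt_encode
riva-deploy "${RMIR}:${KEY}" /data/models
```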