Tried on two different systems and riva quickstart keeps failing to launch.
Hi @ryein
Thanks for your interest in Riva
Can you please share the following with us:
- whether it is Riva or Riva Embedded
- the config.sh used
- the complete log output of `bash riva_init.sh`
- the complete log output of `bash riva_start.sh`
Thanks
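For whoever hits this, a simple way to capture those complete logs (the filenames here are just a suggestion) is to tee each step's output to a file:

```shell
# Save the full output of each quickstart step while still seeing it live
bash riva_init.sh 2>&1 | tee riva_init.log
bash riva_start.sh 2>&1 | tee riva_start.log
```

The resulting riva_init.log and riva_start.log can then be attached as-is.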
Not sure if it’s the same issue, but I also had trouble with:
- riva_quickstart_v2.8.1
- Riva (not embedded)
- config.sh: unmodified/default
riva_init.sh was failing with:
```
....
To install the open-source samples corresponding to this TensorRT release version
run /opt/tensorrt/install_opensource.sh. To build the open source parsers,
plugins, and samples for current top-of-tree on master or a different branch,
run /opt/tensorrt/install_opensource.sh -b <branch>
See https://github.com/NVIDIA/TensorRT for more information.
ERROR: No supported GPU(s) detected to run this container
Failed to detect NVIDIA driver version.
/bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
TensorRT is not available! Will use ONNX backend instead.
2023-01-02 21:17:04,748 [INFO] Writing Riva model repository to '/data/models'...
2023-01-02 21:17:04,748 [INFO] The riva model repo target directory is /data/models
2023-01-02 21:17:06,252 [INFO] Using onnx runtime
2023-01-02 21:17:06,253 [INFO] Extract_binaries for language_model -> /data/models/riva-onnx-riva_text_classification_domain-nn-bert-base-uncased/1
2023-01-02 21:17:06,253 [INFO] extracting {'ckpt': ('nemo.collections.nlp.models.text_classification.text_classification_model.TextClassificationModel', 'model_weights.ckpt'), 'bert_config_file': ('nemo.collections.nlp.models.text_classification.text_classification_model.TextClassificationModel', 'bert-base-uncased_encoder_config.json')} -> /data/models/riva-onnx-riva_text_classification_domain-nn-bert-base-uncased/1
2023-01-02 21:17:07,806 [INFO] Printing copied artifacts:
2023-01-02 21:17:07,806 [INFO] {'ckpt': '/data/models/riva-onnx-riva_text_classification_domain-nn-bert-base-uncased/1/model_weights.ckpt', 'bert_config_file': '/data/models/riva-onnx-riva_text_classification_domain-nn-bert-base-uncased/1/bert-base-uncased_encoder_config.json'}
2023-01-02 21:17:07,806 [ERROR] Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/servicemaker/cli/deploy.py", line 100, in deploy_from_rmir
    generator.serialize_to_disk(
  File "/usr/local/lib/python3.8/dist-packages/servicemaker/triton/triton.py", line 445, in serialize_to_disk
    module.serialize_to_disk(repo_dir, rmir, config_only, verbose, overwrite)
  File "/usr/local/lib/python3.8/dist-packages/servicemaker/triton/triton.py", line 311, in serialize_to_disk
    self.update_binary(version_dir, rmir, verbose)
  File "/usr/local/lib/python3.8/dist-packages/servicemaker/triton/triton.py", line 757, in update_binary
    self.update_binary_from_copied(version_dir, rmir, copied, verbose)
  File "/usr/local/lib/python3.8/dist-packages/servicemaker/triton/triton.py", line 734, in update_binary_from_copied
    raise Exception("Need TRT and bert_config_file for ckpt model")
Exception: Need TRT and bert_config_file for ckpt model
+ '[' 1 -ne 0 ']'
+ echo 'Error in deploying RMIR models.'
Error in deploying RMIR models.
+ exit 1
```
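The root failure is the `No supported GPU(s) detected` line; everything after it (falling back to ONNX, the missing TRT exception) follows from the container never seeing the driver. A minimal sketch for checking whether a saved log hit this specific failure (the log filename is an assumption):

```shell
# Flag the GPU-detection failure if it appears in a saved init log
if grep -q "No supported GPU" riva_init.log 2>/dev/null; then
  echo "container could not detect a supported GPU"
fi
```

Independently, `nvidia-smi` on the host should list the GPU and driver version; if it does not, the container has no chance of detecting it either.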
What I did to get it to proceed was to modify the docker run calls in riva_init.sh to add `--privileged` anywhere a call also used `--gpus` (there were 2 such calls).
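As a sketch of that edit: the sed below appends `--privileged` after each `--gpus <value>` in riva_init.sh. It assumes the flag and its value sit on one line; editing the two calls by hand works just as well.

```shell
# Sketch of the workaround: add --privileged wherever riva_init.sh
# already passes --gpus to docker run (assumes one-line invocations)
if [ -f riva_init.sh ]; then
  sed -i 's/--gpus \([^ ]*\)/--gpus \1 --privileged/' riva_init.sh
fi
```

Note that running it twice would insert `--privileged` twice, so it is a one-shot convenience, not something to leave in a setup script.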
Context:
- I’m running fresh install of PopOS 22.04/Ubuntu 22.04
- Docker version 20.10.22, build 3a2c30b
- nvidia-docker v2.11.0
- I’m using a non-root user to access Docker, but that user is in the docker group
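For anyone reproducing this setup, a quick way to check the group membership mentioned above (the fix suggested in the else branch is the usual `usermod` incantation, shown as a hint only):

```shell
# Check whether the current user is in the docker group
if id -nG | grep -qw docker; then
  echo "user is in the docker group"
else
  echo "not in docker group; fix: sudo usermod -aG docker $(id -un)"
fi
```

The `usermod` change only takes effect after logging out and back in.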
It now seems to be doing a whole lot of something… and I can see it’s using my GPU resources, so, that’s good. (not sure if I will run into issues with riva_start.sh yet, I have not gotten that far)
Hopefully that helps someone…
Follow-up on my previous post: I had to do the opposite in riva_start.sh, adding `--gpus all` to the docker command, which only had `--privileged` in it…
but then it did all seem to work, and I was able to run examples.
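The second workaround can be sketched the same way: insert `--gpus all` next to the existing `--privileged` in riva_start.sh. Again this assumes the flags sit on one line, and a manual edit is equally fine.

```shell
# Sketch: give the docker run call in riva_start.sh access to the GPUs
if [ -f riva_start.sh ]; then
  sed -i 's/--privileged/--privileged --gpus all/' riva_start.sh
fi
```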
Note: I’m on a 4090, which doesn’t have enough VRAM to run all the models at the same time either, so I also disabled the NLP and TTS services at the top of config.sh, ran riva_clean.sh, and then re-initialized.
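For reference, the service toggles near the top of config.sh look roughly like this with NLP and TTS disabled (variable names as in the quickstart defaults; double-check against your copy):

```shell
# config.sh: keep ASR, disable NLP and TTS to reduce GPU memory use
service_enabled_asr=true
service_enabled_nlp=false
service_enabled_tts=false
```

followed by `bash riva_clean.sh` and `bash riva_init.sh` to rebuild the model repository.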
Thanks for the info. I for sure need to play with it more. I’m sure it was user error.
From what I can tell, Riva does not yet support GPUs newer than the Ampere architecture (30 series). If true, Riva does not support the 40 series. Note in your log above:
`ERROR: No supported GPU(s) detected to run this container`
Thanks so much @rbgreenway and @jason.grey
Thanks for your kind inputs, really appreciated.
Sincere apologies for the long delay.
I will confirm:
- whether 40-series cards are supported by Riva
- the riva_start.sh workaround of adding `--gpus all` to the docker command which only had `--privileged` in it
I will share this feedback with the internal team and get their inputs.
Thanks