Please provide the following information when requesting support.
Hardware - GPU (A100)
Hardware - CPU
Operating System Ubuntu 22.04 and CentOS 8 - same results from each installation attempt.
Riva Version 2.5
TLT Version (if relevant)
How to reproduce the issue? (This is for errors. Please share the command and the detailed log here)
My installation process:
sudo dnf config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo
sudo dnf repolist -v
sudo dnf install -y https://download.docker.com/linux/centos/7/x86_64/stable/Packages/containerd.io-1.4.3-3.1.el7.x86_64.rpm
sudo dnf install docker-ce -y
sudo systemctl --now enable docker
sudo docker run --rm hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
2db29710123e: Pull complete
Digest: sha256:7d246653d0511db2a6b2e0436cfd0e52ac8c066000264b3ce63331ac66dca625
Status: Downloaded newer image for hello-world:latest
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
(amd64)
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker ID:
https://hub.docker.com/
For more examples and ideas, visit:
https://docs.docker.com/get-started/
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
   && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.repo | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
yum-config-manager --enable libnvidia-container-experimental
sudo dnf clean expire-cache --refresh
sudo dnf install -y nvidia-docker2
sudo systemctl restart docker
sudo docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
Unable to find image 'nvidia/cuda:11.0.3-base-ubuntu20.04' locally
11.0.3-base-ubuntu20.04: Pulling from nvidia/cuda
d7bfe07ed847: Pull complete
75eccf561042: Pull complete
191419884744: Pull complete
a17a942db7e1: Pull complete
16156c70987f: Pull complete
Digest: sha256:57455121f3393b7ed9e5a0bc2b046f57ee7187ea9ec562a7d17bf8c97174040d
Status: Downloaded newer image for nvidia/cuda:11.0.3-base-ubuntu20.04
Fri Sep 2 21:23:56 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.73.08    Driver Version: 510.73.08    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GRID A100D-2-20C     On  | 00000000:04:00.0 Off |                   On |
| N/A   N/A    P0    N/A /  N/A |                  N/A |     N/A      Default |
|                               |                      |              Enabled |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| MIG devices:                                                                |
+------------------+----------------------+-----------+-----------------------+
| GPU  GI  CI  MIG |         Memory-Usage |        Vol|         Shared        |
|      ID  ID  Dev |           BAR1-Usage | SM     Unc| CE  ENC  DEC  OFA  JPG|
|                  |                      |        ECC|                       |
|==================+======================+===========+=======================|
|  0    0   0   0  |      0MiB / 18411MiB | 28      0 |  2   0    1    0    0 |
|                  |      0MiB /  4096MiB |           |                       |
+------------------+----------------------+-----------+-----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
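For reference, the `distribution` string used in the libnvidia-container repository URL above is built from `/etc/os-release`. The sketch below shows what it evaluates to. `distribution_from` is a hypothetical helper name introduced here for illustration only (it is not part of any NVIDIA tooling), and it assumes the file defines the standard `ID` and `VERSION_ID` fields.

```shell
#!/bin/sh
# Hypothetical helper (not part of any NVIDIA tooling): print the string that
# $ID$VERSION_ID expands to for a given os-release style file.
distribution_from() (
    # The function body is a subshell, so the sourced variables
    # do not leak into the calling shell.
    . "$1"
    printf '%s%s\n' "$ID" "$VERSION_ID"
)

# Show the value for the current host, if the file is readable.
if [ -r /etc/os-release ]; then
    distribution_from /etc/os-release
fi
```

With `ID="centos"` and `VERSION_ID="8"` this prints `centos8`; with `ID=ubuntu` and `VERSION_ID="22.04"` it prints `ubuntu22.04`. The curl step only succeeds if that exact string matches a directory published under the repository URL.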
wget --content-disposition https://ngc.nvidia.com/downloads/ngccli_linux.zip && unzip ngccli_linux.zip && chmod u+x ngc-cli/ngc
echo "export PATH=\"\$PATH:$(pwd)/ngc-cli\"" >> ~/.bash_profile && source ~/.bash_profile
ngc config set
Successfully saved NGC configuration to /root/.ngc/config
ngc registry resource download-version nvidia/riva/riva_quickstart:2.5.0
cd riva_quickstart_v2.5.0
bash riva_init.sh
2022-09-02 21:34:00,957 [INFO] Extract_binaries for conformer-en-US-asr-offline -> /data/models/conformer-en-US-asr-offline/1
2022-09-02 21:34:00,957 [INFO] extracting {'wfst_tokenizer': '/mnt/nvdl/datasets/jarvis_speech_ci/model_files/sp-itn/22.05/en/tokenize_and_classify.far', 'wfst_verbalizer': '/mnt/nvdl/datasets/jarvis_speech_ci/model_files/sp-itn/22.05/en/verbalize.far'} -> /data/models/conformer-en-US-asr-offline/1
2022-09-02 21:34:04,468 [INFO] Using onnx runtime
2022-09-02 21:34:04,468 [INFO] Extract_binaries for language_model -> /data/models/riva-onnx-riva-punctuation-en-US-nn-bert-base-uncased/1
2022-09-02 21:34:04,468 [INFO] extracting {'ckpt': ('nemo.collections.nlp.models.token_classification.punctuation_capitalization_model.PunctuationCapitalizationModel', 'model_weights.ckpt'), 'bert_config_file': ('nemo.collections.nlp.models.token_classification.punctuation_capitalization_model.PunctuationCapitalizationModel', 'bert-base-uncased_encoder_config.json')} -> /data/models/riva-onnx-riva-punctuation-en-US-nn-bert-base-uncased/1
2022-09-02 21:34:09,076 [INFO] Printing copied artifacts:
2022-09-02 21:34:09,077 [INFO] {'ckpt': '/data/models/riva-onnx-riva-punctuation-en-US-nn-bert-base-uncased/1/model_weights.ckpt', 'bert_config_file': '/data/models/riva-onnx-riva-punctuation-en-US-nn-bert-base-uncased/1/bert-base-uncased_encoder_config.json'}
2022-09-02 21:34:09,077 [ERROR] Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/servicemaker/cli/deploy.py", line 100, in deploy_from_rmir
    generator.serialize_to_disk(
  File "/usr/local/lib/python3.8/dist-packages/servicemaker/triton/triton.py", line 434, in serialize_to_disk
    module.serialize_to_disk(repo_dir, rmir, config_only, verbose, overwrite)
  File "/usr/local/lib/python3.8/dist-packages/servicemaker/triton/triton.py", line 313, in serialize_to_disk
    self.generate_config(version_dir, rmir)
  File "/usr/local/lib/python3.8/dist-packages/servicemaker/triton/triton.py", line 352, in generate_config
    input=self._inputs,
AttributeError: 'RivaBertEncoder' object has no attribute '_inputs'
+ '[' 1 -ne 0 ']'
+ echo 'Error in deploying RMIR models.'
Error in deploying RMIR models.
+ exit 1
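In case it helps anyone reproducing this, the first error and its traceback can be pulled out of a saved log instead of scrolling the terminal. This is only a minimal sketch: `riva_init.log` is an assumed file name (for example, produced with `bash riva_init.sh 2>&1 | tee riva_init.log`), and `first_error` is a hypothetical helper, not part of the Riva quick start scripts.

```shell
#!/bin/sh
# Hypothetical helper: print the first line containing [ERROR] in a log,
# followed by up to N more lines (the traceback). N defaults to 12.
first_error() {
    log_file="$1"
    context="${2:-12}"
    # On the first match, arm a counter; print lines while it is positive.
    awk -v n="$context" '
        /\[ERROR\]/ && !found { found = 1; left = n + 1 }
        left > 0              { print; left-- }
    ' "$log_file"
}

# Only run against the assumed log file if it is present.
if [ -f riva_init.log ]; then
    first_error riva_init.log
fi
```

Using `awk` rather than `grep -A` keeps the sketch within POSIX tools; the second argument can be raised if the traceback is longer than the default.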
I am seeing the same result when installing Riva 2.5 on fresh Ubuntu 22.04 and CentOS 8 servers.
What am I missing?
Please advise. Thank you kindly.