Jetson AGX Orin: riva_init.sh Fails

ab-tools · June 19, 2022, 4:01pm

Hardware: Nvidia Jetson AGX Orin Development Kit
Operating System: Ubuntu 20.04.4 LTS
Riva Version: Riva Skills Quick Start 2.2.1
Jetpack Version: 5.0.1
Docker Version: 20.10.12 (nvidia-docker installed)

Hello,

I’m trying to follow the Riva Skills Quick Start tutorial, but unfortunately I already get stuck when running riva_init.sh, which produces the following output (this is the full script log):

Please enter API key for ngc.nvidia.com: 
Logging into NGC docker registry if necessary...
Pulling required docker images if necessary...
Note: This may take some time, depending on the speed of your Internet connection.
> Pulling Riva Speech Server images.
  > Pulling nvcr.io/nvidia/riva/riva-speech:2.2.1-server. This may take some time...

Downloading models (RMIRs) from NGC...
Note: this may take some time, depending on the speed of your Internet connection.
To skip this process and use existing RMIRs set the location and corresponding flag in config.sh.
Error response from daemon: No such container: riva-models-download
Error in downloading models.

I already tried running it again after a riva_clean.sh run, with the same result.
My config.sh can be found here: config.sh (13.3 KB)

Only ASR is enabled, with the de-DE language, but I also tried en-US, again with the same result.

What am I doing wrong?

Best regards and thanks in advance
Andreas

rvinobha · June 21, 2022, 3:30pm

Hi @ab-tools

Thanks for your interest in Riva.

Apologies that you are facing issues.

I will check on this error further with the team and get back to you.

Thanks for your patience.

ab-tools · June 21, 2022, 4:30pm

Thanks, @rvinobha, I appreciate your quick reply!

I’ll wait for further feedback from your team on this issue.

Best regards and thanks for your support
Andreas

ab-tools · June 29, 2022, 4:12pm

Hello @rvinobha,

I don’t want to bother you, but did you have a chance to check with your team regarding this issue?
Perhaps there is something further we can test on our side to help with the investigation?

We’re a bit stuck here at the moment as we cannot proceed with this project before getting Riva initialized on our AGX Jetson device as a start.

Best regards and thanks again for your support
Andreas

rvinobha · June 30, 2022, 4:54pm

Hi @ab-tools

Thanks for your interest in Riva,

Apologies for the delay; I will get more information.

In the meantime, I have a quick suggestion, since the de-DE model is used in config.sh:

I recommend uncommenting line 136 in config.sh. Please try that and let us know if the error still persists.

Thanks

ab-tools · June 30, 2022, 6:08pm

Hello @rvinobha,

first thanks a lot for your reply, much appreciated!

I’ve tried what you suggested and uncommented the line you mentioned. Please find a new config.sh attached for reference: config.sh (13.3 KB)

But unfortunately this did not seem to resolve the issue either. Here are the tests I ran:

server@TestServer:~/riva_quickstart_v2.2.1$ vi config.sh 
server@TestServer:~/riva_quickstart_v2.2.1$ sudo bash riva_init.sh 
[sudo] password for server: 
Please enter API key for ngc.nvidia.com: 
Logging into NGC docker registry if necessary...
Pulling required docker images if necessary...
Note: This may take some time, depending on the speed of your Internet connection.
> Pulling Riva Speech Server images.
  > Pulling nvcr.io/nvidia/riva/riva-speech:2.2.1-server. This may take some time...

Downloading models (RMIRs) from NGC...
Note: this may take some time, depending on the speed of your Internet connection.
To skip this process and use existing RMIRs set the location and corresponding flag in config.sh.
Error response from daemon: Container f83e679455c31a5937068c2777c6e6566052735060bb271f0fcedd9a8a9f611d is not running
Error in downloading models.
server@TestServer:~/riva_quickstart_v2.2.1$ sudo bash riva_init.sh 
Please enter API key for ngc.nvidia.com: 
Logging into NGC docker registry if necessary...
Pulling required docker images if necessary...
Note: This may take some time, depending on the speed of your Internet connection.
> Pulling Riva Speech Server images.
  > Image nvcr.io/nvidia/riva/riva-speech:2.2.1-server exists. Skipping.

Downloading models (RMIRs) from NGC...
Note: this may take some time, depending on the speed of your Internet connection.
To skip this process and use existing RMIRs set the location and corresponding flag in config.sh.
Error response from daemon: No such container: riva-models-download
Error in downloading models.
server@TestServer:~/riva_quickstart_v2.2.1$ sudo bash riva_clean.sh 
Cleaning up local Riva installation.
Image nvcr.io/nvidia/riva/riva-speech:2.2.1-server found. Delete? [y/N] y
Image nvcr.io/nvidia/riva/riva-speech:2.2.1-servicemaker has not been downloaded, or has already been deleted.
Error: No such volume: /home/server/riva_quickstart_v2.2.1/model_repository
'/home/server/riva_quickstart_v2.2.1/model_repository' is not a Docker volume, or has already been deleted.
Found models at '/home/server/riva_quickstart_v2.2.1/model_repository'. Delete? [y/N] y
server@TestServer:~/riva_quickstart_v2.2.1$ sudo bash riva_init.sh 
Please enter API key for ngc.nvidia.com: 
Logging into NGC docker registry if necessary...
Pulling required docker images if necessary...
Note: This may take some time, depending on the speed of your Internet connection.
> Pulling Riva Speech Server images.
  > Pulling nvcr.io/nvidia/riva/riva-speech:2.2.1-server. This may take some time...

Downloading models (RMIRs) from NGC...
Note: this may take some time, depending on the speed of your Internet connection.
To skip this process and use existing RMIRs set the location and corresponding flag in config.sh.
Error response from daemon: No such container: riva-models-download
Error in downloading models.

As you can see, I once got a different error message:

Error response from daemon: Container f83e679455c31a5937068c2777c6e6566052735060bb271f0fcedd9a8a9f611d is not running

But after re-running the script, even after running riva_clean.sh, I was always back to my previous error message:

Error response from daemon: No such container: riva-models-download

Maybe there is something very simple I’m missing here, but I can’t figure out what the problem is.
In particular, I don’t understand why I once got the error message that a certain container is not running, and then on the next try the message that it does not exist at all (even after a clean).
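
One way to investigate the two alternating messages is to look at the download container directly after a failed run. The following is only a sketch under stated assumptions: the helper name show_container_state is hypothetical, the container name riva-models-download is taken from the error message above, and the thread does not show how riva_init.sh creates or removes that container. On this setup the function would have to run as root (for example from a sudo shell), since docker is invoked with sudo throughout.

```shell
# Hypothetical helper (not part of the Riva Quick Start scripts).
# Lists any container, running or exited, whose name matches the argument,
# then prints the tail of its logs if the container still exists.
show_container_state() {
  container_name="$1"
  docker ps -a --filter "name=${container_name}" --format '{{.Names}}\t{{.Status}}'
  docker logs --tail 50 "$container_name" 2>&1 ||
    echo "could not read logs for '${container_name}' (it may have been removed)"
}

# Usage, after a failed riva_init.sh run:
#   show_container_state riva-models-download
```

An "Exited" status in the first command's output would mean the container started and stopped on its own, which is when its logs are worth reading.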

Please let me know if there is anything else I could test.

Best regards and again thanks a lot for your support
Andreas

rvinobha · July 4, 2022, 4:56pm

Hi @ab-tools

Thanks for your interest in Riva,

Thank you so much for your time and patience,

I have updates from the team,

Kindly rerun the NGC CLI setup and make sure you generate an API key.

The instructions can be found on the following link, under “ARM64 Linux”: NVIDIA NGC

During setup, the command “ngc config set” will ask for an API key, which can be generated as described here: NGC Overview :: NVIDIA GPU Cloud Documentation

As a last step, kindly make sure your user can run Docker without sudo by adding it to the docker group: sudo usermod -aG docker $USER

Finally, please reboot the device after all the above steps; you should then be able to run the riva_init.sh script.
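
A group change made with usermod only takes effect in new login sessions, which is why a reboot (or logging out and back in) is needed. A minimal sketch for checking this is below; both function names are hypothetical, and the only commands used are id, tr and grep.

```shell
# Hypothetical helpers (names are illustrative only).

# Succeeds if the account database lists USER as a member of GROUP.
# This reflects "usermod -aG" immediately.
account_has_group() {
  id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$2"
}

# Succeeds if the current shell session already carries GROUP.
# This only becomes true after a new login or a reboot.
session_has_group() {
  id -nG | tr ' ' '\n' | grep -qxF -- "$1"
}

# Usage:
#   account_has_group "$USER" docker && echo "docker group is configured"
#   session_has_group docker || echo "log in again or reboot to pick up the group"
```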

Please let us know if you face any issues

Thanks

ab-tools · July 5, 2022, 10:18am

Hello @rvinobha,

thanks for your reply again.

I’ve tried what you said and did all the NGC CLI steps again (which I had done previously) and ran the docker related command.

Unfortunately, I still get this error message here upon running the init script:

:~/riva_quickstart_v2.2.1$ sudo bash riva_init.sh
Please enter API key for ngc.nvidia.com: 
Logging into NGC docker registry if necessary...
Pulling required docker images if necessary...
Note: This may take some time, depending on the speed of your Internet connection.
> Pulling Riva Speech Server images.
  > Pulling nvcr.io/nvidia/riva/riva-speech:2.2.1-server. This may take some time...

Downloading models (RMIRs) from NGC...
Note: this may take some time, depending on the speed of your Internet connection.
To skip this process and use existing RMIRs set the location and corresponding flag in config.sh.
Error response from daemon: Container 401119e825022fb57382e50d3f157a8f30c6f0dd669521b1d76fa9392d4afcfa is not running
Error in downloading models.

Do you have any other ideas what I could try/investigate?

Best regards and appreciating your support
Andreas

rvinobha · July 13, 2022, 8:44am

Hi @ab-tools

Thanks for your interest in Riva

Apologies for the delay.

I am checking further with the team; our guess is that this is a Docker or NGC related issue.

1. Verification of the API key:
Kindly run the command cat $HOME/.ngc/config (NOTE: please do not edit or change the file).
Please verify and confirm that the apikey present in the file is the latest one (the last one generated from the portal).

2. Docker verification: kindly run the commands below.
sudo usermod -aG docker $USER
sudo docker login nvcr.io
When prompted for the username, please type $oauthtoken.
When prompted for your password, enter your NGC API key.
If the credentials are already set, the command should show Login Succeeded if everything is fine.
Please try to pull any sample image, e.g.

sudo docker pull nvcr.io/nvidia/tensorflow:18.06-py3

and verify that the image pulls successfully.

3. Please try the installation with NGC_API_KEY set in the environment:
export NGC_API_KEY=<your-api-key>
and then try the installation.
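
The two API key checks above can be scripted without printing the key itself. This is only a sketch: the function names are hypothetical, and it assumes the NGC CLI config file is INI-style with a line of the form "apikey = ...", which is not shown in this thread.

```shell
# Hypothetical helpers (not part of the NGC CLI or the Riva scripts).

# Succeeds if the config file (default: $HOME/.ngc/config) is readable and
# contains a non-empty "apikey = ..." line. Assumption: INI-style file.
# The file is only read, never modified, and the key is not printed.
ngc_config_has_key() {
  config_file="${1:-$HOME/.ngc/config}"
  [ -r "$config_file" ] &&
    grep -Eq '^[[:space:]]*apikey[[:space:]]*=[[:space:]]*[^[:space:]]' "$config_file"
}

# Succeeds if NGC_API_KEY is set to a non-empty value in the environment.
ngc_env_key_set() {
  [ -n "${NGC_API_KEY:-}" ]
}

# Usage:
#   ngc_config_has_key || echo "no apikey found in ~/.ngc/config"
#   ngc_env_key_set    || echo "run: export NGC_API_KEY=<your-api-key>"
```

Neither check can tell whether the key is the latest one generated in the portal; that still has to be compared by hand.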

Also, when executing
ngc registry resource download-version nvidia/riva/riva_quickstart_arm64:2.3.0
it will output information like the following; if you can send that too, it would be helpful:
{
    "download_end": "2022-07-13 08:37:59.199892",
    "download_start": "2022-07-13 08:37:55.193366",
    "download_time": "4s",
    "files_downloaded": 30,
    "local_path": "/home/test_sample/riva_quickstart_v2.3.0",
    "size_downloaded": "85.86 KB",
    "status": "Completed",
    "transfer_id": "riva_quickstart_v2.3.0"
}

Thanks for your patience

ab-tools · July 13, 2022, 10:07am

Hello @rvinobha,

thanks for your detailed reply, unfortunately I still can’t get it working.

Regarding 1:
Confirmed correct API key is in the NGC config file.

Regarding 2:
I’ve run both commands and can confirm that I immediately got Login Succeeded without entering the API key, as the correct one was already present:

server@AB-Server:/data/riva_quickstart_v2.3.0$ sudo usermod -aG docker server
server@AB-Server:/data/riva_quickstart_v2.3.0$ sudo docker login nvcr.io
Authenticating with existing credentials...
WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded

When running sudo docker pull nvcr.io/nvidia/tensorflow:18.06-py3 all succeeded as well:

server@AB-Server:/data/riva_quickstart_v2.3.0$ sudo docker pull nvcr.io/nvidia/tensorflow:18.06-py3
18.06-py3: Pulling from nvidia/tensorflow
297061f60c36: Pull complete 
e9ccef17b516: Pull complete 
dbc33716854d: Pull complete 
8fe36b178d25: Pull complete 
686596545a94: Pull complete 
97becc2da853: Pull complete 
4691961fccc9: Pull complete 
8281fad90fcd: Pull complete 
e173493a7585: Pull complete 
4cc321c55676: Pull complete 
1282ed316170: Pull complete 
a1bcd0b40bfb: Pull complete 
ce951825002a: Pull complete 
3992697887ca: Pull complete 
a2580d50482e: Pull complete 
734ab5a9097a: Pull complete 
d805760875e3: Pull complete 
d4d863625491: Pull complete 
cb1ba57aaad1: Pull complete 
eabd681cb1d8: Pull complete 
ff7e9a510db5: Pull complete 
3ce05ba4ee2a: Pull complete 
7d3ca913a0cb: Pull complete 
feab07a5a6c4: Pull complete 
6edf251b3f8e: Pull complete 
68e095703e23: Pull complete 
093323c5e0eb: Pull complete 
0c841ffa800c: Pull complete 
7a0d06184de5: Pull complete 
ef362687c1a5: Pull complete 
d6bbf97e9e09: Pull complete 
534c66ee05c8: Pull complete 
e3f4a858e69c: Pull complete 
489e4b1c5775: Pull complete 
1df65788cbec: Pull complete 
5f8cdcbd8fe9: Pull complete 
ef7b68c398c6: Pull complete 
2a3259313c42: Pull complete 
08b3543110eb: Pull complete 
Digest: sha256:f6ae3be0464c8e4a0558343f9de1123d1962c37574ff674ebdbb967deabbdd32
Status: Downloaded newer image for nvcr.io/nvidia/tensorflow:18.06-py3
nvcr.io/nvidia/tensorflow:18.06-py3

Regarding 3:
I’ve exported the API key as requested and downloaded the new 2.3.0 version of Riva QuickStart, with the following output:

server@AB-Server:/data$ ngc registry resource download-version "nvidia/riva/riva_quickstart:2.3.0"
Downloaded 85.86 KB in 7s, Download speed: 12.25 KB/s               
----------------------------------------------------
Transfer id: riva_quickstart_v2.3.0 Download status: Completed.
Downloaded local path: /data/riva_quickstart_v2.3.0
Total files downloaded: 30 
Total downloaded size: 85.86 KB
Started at: 2022-07-13 11:29:12.087344
Completed at: 2022-07-13 11:29:19.101574
Duration taken: 7s
----------------------------------------------------

Then I ran riva_init.sh again, but unfortunately this also led to the same error message again:

server@AB-Server:/data/riva_quickstart_v2.3.0$ sudo bash riva_init.sh
Please enter API key for ngc.nvidia.com: 
Logging into NGC docker registry if necessary...
Pulling required docker images if necessary...
Note: This may take some time, depending on the speed of your Internet connection.
> Pulling Riva Speech Server images.
  > Pulling nvcr.io/nvidia/riva/riva-speech:2.3.0-server. This may take some time...

Downloading models (RMIRs) from NGC...
Note: this may take some time, depending on the speed of your Internet connection.
To skip this process and use existing RMIRs set the location and corresponding flag in config.sh.
Error response from daemon: No such container: riva-models-download
Error in downloading models.

As I applied the changes to the new config file of the 2.3.0 version manually again, please find here the new config.sh as a reference, just to be sure I didn’t mess something up there: config.sh (14.1 KB)

Before doing all this, I ran sudo apt update and sudo apt upgrade to ensure all packages are up to date.
All system updates succeeded without errors.

I’m not sure what could be special or wrong on this system, and I really appreciate all your support.

Best regards
Andreas

rvinobha · July 13, 2022, 5:54pm

Hi @ab-tools

Thanks for your interest in Riva,

My apologies, I totally missed this. Just a quick check: in the logs shared above, we see the following command:

ngc registry resource download-version nvidia/riva/riva_quickstart:2.3.0

That resource is meant for servers. Since AGX Orin is an embedded device, we need to use

ngc registry resource download-version nvidia/riva/riva_quickstart_arm64:2.3.0

We need to follow the instructions for Embedded in the Riva docs.
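
The choice between the two resources can be tied to the machine architecture so the wrong one is not downloaded by accident. This is a minimal sketch, not an official tool: the function name riva_quickstart_resource is hypothetical, and mapping x86_64 to the server resource is an assumption based on the remark above that riva_quickstart is meant for servers.

```shell
# Hypothetical helper (not part of the NGC CLI).
# Prints the Quick Start resource for a given version and architecture.
# The architecture defaults to the output of "uname -m".
riva_quickstart_resource() {
  qs_version="$1"
  qs_arch="${2:-$(uname -m)}"
  case "$qs_arch" in
    aarch64|arm64)
      # Embedded devices such as Jetson AGX Orin
      echo "nvidia/riva/riva_quickstart_arm64:${qs_version}" ;;
    x86_64|amd64)
      # Assumption: x86_64 hosts use the server resource
      echo "nvidia/riva/riva_quickstart:${qs_version}" ;;
    *)
      echo "unsupported architecture: ${qs_arch}" >&2
      return 1 ;;
  esac
}

# Usage (requires a configured NGC CLI):
#   ngc registry resource download-version "$(riva_quickstart_resource 2.3.0)"
```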

Let us know if it works.

Thanks for your patience

ab-tools · July 14, 2022, 10:28am

Hello @rvinobha,

you are fully right, thanks a lot!
Indeed when using the different command now everything seems to work fine. :-)

Best regards and thanks again for your support
Andreas

nadeemm · July 20, 2022, 8:26pm

Thanks for trying out the suggestion - and I am happy it worked out.
So I am marking this thread as solved and it will automatically close.

system · August 3, 2022, 8:26pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
