Jetson AGX Orin: riva_init.sh Fails

Hardware: NVIDIA Jetson AGX Orin Development Kit
Operating System: Ubuntu 20.04.4 LTS
Riva Version: Riva Skills Quick Start 2.2.1
JetPack Version: 5.0.1
Docker Version: 20.10.12 (nvidia-docker installed)

Hello,

I’m trying to follow the Riva Skills Quick Start tutorial, but I already get stuck when running riva_init.sh, which produces the following output (this is the full script log):

Please enter API key for ngc.nvidia.com: 
Logging into NGC docker registry if necessary...
Pulling required docker images if necessary...
Note: This may take some time, depending on the speed of your Internet connection.
> Pulling Riva Speech Server images.
  > Pulling nvcr.io/nvidia/riva/riva-speech:2.2.1-server. This may take some time...

Downloading models (RMIRs) from NGC...
Note: this may take some time, depending on the speed of your Internet connection.
To skip this process and use existing RMIRs set the location and corresponding flag in config.sh.
Error response from daemon: No such container: riva-models-download
Error in downloading models.

I already tried running it again after a riva_clean.sh run, with the same result.
My config.sh can be found here: config.sh (13.3 KB)

Only ASR is enabled, for the de-DE language, but I also tried en-US, with the same result.

What am I doing wrong?

Best regards and thanks in advance
Andreas

Hi @ab-tools

Thanks for your interest in Riva.

Apologies that you are facing issues.

I will check on this error further with the team and get back to you.

Thanks for your patience.

Thanks, @rvinobha, I appreciate your quick reply!

I’ll wait for further feedback from your team on this issue.

Best regards and thanks for your support
Andreas

Hello @rvinobha,

I don’t want to bother you, but did you have a chance to check with your team regarding this issue?
Perhaps there is something further we can test on our side to help with the investigation?

We’re a bit stuck at the moment, as we cannot proceed with this project before getting Riva initialized on our Jetson AGX device.

Best regards and thanks again for your support
Andreas

Hi @ab-tools

Thanks for your interest in Riva.

Apologies for the delay. I will get more information.

I just have a quick suggestion, since the de-DE model is used in config.sh:

I recommend uncommenting line 136 in config.sh. Please try that and let us know if the error still persists.

Thanks

Hello @rvinobha,

First, thanks a lot for your reply, much appreciated!

I’ve tried what you suggested and uncommented the line you mentioned. Please find a new config.sh attached for reference: config.sh (13.3 KB)

Unfortunately, this did not resolve the issue either. Here are the tests I ran:

server@TestServer:~/riva_quickstart_v2.2.1$ vi config.sh 
server@TestServer:~/riva_quickstart_v2.2.1$ sudo bash riva_init.sh 
[sudo] password for server: 
Please enter API key for ngc.nvidia.com: 
Logging into NGC docker registry if necessary...
Pulling required docker images if necessary...
Note: This may take some time, depending on the speed of your Internet connection.
> Pulling Riva Speech Server images.
  > Pulling nvcr.io/nvidia/riva/riva-speech:2.2.1-server. This may take some time...

Downloading models (RMIRs) from NGC...
Note: this may take some time, depending on the speed of your Internet connection.
To skip this process and use existing RMIRs set the location and corresponding flag in config.sh.
Error response from daemon: Container f83e679455c31a5937068c2777c6e6566052735060bb271f0fcedd9a8a9f611d is not running
Error in downloading models.
server@TestServer:~/riva_quickstart_v2.2.1$ sudo bash riva_init.sh 
Please enter API key for ngc.nvidia.com: 
Logging into NGC docker registry if necessary...
Pulling required docker images if necessary...
Note: This may take some time, depending on the speed of your Internet connection.
> Pulling Riva Speech Server images.
  > Image nvcr.io/nvidia/riva/riva-speech:2.2.1-server exists. Skipping.

Downloading models (RMIRs) from NGC...
Note: this may take some time, depending on the speed of your Internet connection.
To skip this process and use existing RMIRs set the location and corresponding flag in config.sh.
Error response from daemon: No such container: riva-models-download
Error in downloading models.
server@TestServer:~/riva_quickstart_v2.2.1$ sudo bash riva_clean.sh 
Cleaning up local Riva installation.
Image nvcr.io/nvidia/riva/riva-speech:2.2.1-server found. Delete? [y/N] y
Image nvcr.io/nvidia/riva/riva-speech:2.2.1-servicemaker has not been downloaded, or has already been deleted.
Error: No such volume: /home/server/riva_quickstart_v2.2.1/model_repository
'/home/server/riva_quickstart_v2.2.1/model_repository' is not a Docker volume, or has already been deleted.
Found models at '/home/server/riva_quickstart_v2.2.1/model_repository'. Delete? [y/N] y
server@TestServer:~/riva_quickstart_v2.2.1$ sudo bash riva_init.sh 
Please enter API key for ngc.nvidia.com: 
Logging into NGC docker registry if necessary...
Pulling required docker images if necessary...
Note: This may take some time, depending on the speed of your Internet connection.
> Pulling Riva Speech Server images.
  > Pulling nvcr.io/nvidia/riva/riva-speech:2.2.1-server. This may take some time...

Downloading models (RMIRs) from NGC...
Note: this may take some time, depending on the speed of your Internet connection.
To skip this process and use existing RMIRs set the location and corresponding flag in config.sh.
Error response from daemon: No such container: riva-models-download
Error in downloading models.

As you can see, I once got a different error message:

Error response from daemon: Container f83e679455c31a5937068c2777c6e6566052735060bb271f0fcedd9a8a9f611d is not running

But after re-running the script, even after running riva_clean.sh, I was always back to my previous error message:

Error response from daemon: No such container: riva-models-download

Maybe there is something very simple I’m missing here, but I can’t figure out what the problem is.
In particular, I don’t understand why I once got the error that a container is not running and then, on the next try, the error that it does not exist at all (even after a clean).
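
The two daemon messages describe two different container states: "No such container" means that no container with that name exists (any longer), while "is not running" means that it exists but has stopped. A minimal sketch for checking which state applies right after a failed run, using only the container name riva-models-download from the log. The helper name container_state is hypothetical, and the sketch does not explain why the container stopped:

```shell
# Hypothetical helper (not part of the Riva scripts): classify a container
# by name as "absent", "running", or "not running".
container_state() {
  # `docker ps -a` also lists stopped containers. The name filter matches
  # substrings, so only the first hit is used.
  _status="$(docker ps -a --filter "name=$1" --format '{{.Status}}' | head -n 1)"
  case "$_status" in
    "")  echo "absent" ;;       # the daemon would report: No such container
    Up*) echo "running" ;;
    *)   echo "not running" ;;  # e.g. "Exited (1) 5 seconds ago"
  esac
}
```

For example, container_state riva-models-download could be run directly after riva_init.sh fails, as a user that is allowed to talk to the Docker daemon. If the script removes the container automatically once it exits (an assumption, not confirmed in this thread), the result may change from "not running" to "absent" within moments.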

Please let me know if there is anything else I could test.

Best regards and again thanks a lot for your support
Andreas

Hi @ab-tools

Thanks for your interest in Riva.

Thank you so much for your time and patience.

I have updates from the team.

Kindly rerun the NGC CLI setup and make sure you generate an API key.

The instructions can be found at the following link, under “ARM64 Linux”: NVIDIA NGC

During setup, the command “ngc config set” will ask for an API key, which can be generated as described here: NGC Overview :: NVIDIA GPU Cloud Documentation

As a last step, kindly ensure your user can run Docker without sudo by adding it to the docker group: sudo usermod -aG docker $USER

Finally, please reboot the device after all the above steps. You should then be able to run the riva_init.sh script.
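
A group change made with usermod only takes effect in new login sessions, which is why the reboot matters. A small sketch for confirming that the membership is active in the current session. The helper name has_group is hypothetical, and it only inspects the group list that is passed in:

```shell
# Hypothetical helper: succeed if group $1 appears in the space-separated
# group list $2 (as printed by `id -nG`).
has_group() {
  for g in $2; do
    [ "$g" = "$1" ] && return 0
  done
  return 1
}

# Usage: `id -nG` without a user name prints the groups of the current
# session, which is what matters for running docker without sudo.
if has_group docker "$(id -nG)"; then
  echo "docker group is active in this session"
else
  echo "docker group not active yet; log out and back in, or reboot"
fi
```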

Please let us know if you face any issues

Thanks

Hello @rvinobha,

Thanks for your reply again.

I’ve tried what you said: I repeated all the NGC CLI steps (which I had done previously) and ran the Docker-related command.

Unfortunately, I still get this error message here upon running the init script:

:~/riva_quickstart_v2.2.1$ sudo bash riva_init.sh
Please enter API key for ngc.nvidia.com: 
Logging into NGC docker registry if necessary...
Pulling required docker images if necessary...
Note: This may take some time, depending on the speed of your Internet connection.
> Pulling Riva Speech Server images.
  > Pulling nvcr.io/nvidia/riva/riva-speech:2.2.1-server. This may take some time...

Downloading models (RMIRs) from NGC...
Note: this may take some time, depending on the speed of your Internet connection.
To skip this process and use existing RMIRs set the location and corresponding flag in config.sh.
Error response from daemon: Container 401119e825022fb57382e50d3f157a8f30c6f0dd669521b1d76fa9392d4afcfa is not running
Error in downloading models.

Do you have any other ideas about what I could try or investigate?

Best regards, and I appreciate your support
Andreas

Hi @ab-tools

Thanks for your interest in Riva.

Apologies for the delay.

I am checking further with the team. Our guess is that this is a Docker- or NGC-related issue.

  1. Verification of the API key:
    Kindly run the command cat $HOME/.ngc/config (NOTE: please do not edit or change the file).
    Please verify and confirm that the apikey in the file is the latest one (the one last generated from the portal).

  2. Docker verification: kindly run the commands below.
    sudo usermod -aG docker $USER
    sudo docker login nvcr.io
    When prompted for the username, please type $oauthtoken
    When prompted for the password, enter your NGC API key.
    If the credentials are already set, the command should show Login Succeeded if everything is fine.
    Please try to pull any sample image, e.g.

sudo docker pull nvcr.io/nvidia/tensorflow:18.06-py3

and verify that the image pulls successfully.

  3. Please try the installation with NGC_API_KEY set in the environment:
    export NGC_API_KEY=<your-api-key>
    and then run the installation.

Also, when you execute
ngc registry resource download-version nvidia/riva/riva_quickstart_arm64:2.3.0
it will output information like the example below. If you can send that too, it would be helpful.
{
  "download_end": "2022-07-13 08:37:59.199892",
  "download_start": "2022-07-13 08:37:55.193366",
  "download_time": "4s",
  "files_downloaded": 30,
  "local_path": "/home/test_sample/riva_quickstart_v2.3.0",
  "size_downloaded": "85.86 KB",
  "status": "Completed",
  "transfer_id": "riva_quickstart_v2.3.0"
}
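
The Docker login check above can also be run non-interactively. This is a sketch, assuming the key has been exported as NGC_API_KEY as suggested above. The function name ngc_docker_login is hypothetical. The single quotes matter: $oauthtoken is the literal username, not a shell variable to expand.

```shell
# Hypothetical wrapper around `docker login` for nvcr.io.
ngc_docker_login() {
  if [ -z "${NGC_API_KEY:-}" ]; then
    echo "NGC_API_KEY is not set" >&2
    return 1
  fi
  # Single quotes keep $oauthtoken literal. The key is passed on stdin,
  # so it is not given as a command-line argument.
  printf '%s' "$NGC_API_KEY" |
    docker login nvcr.io --username '$oauthtoken' --password-stdin
}
```

Calling ngc_docker_login after exporting the key should end with a Login Succeeded line if the credentials are accepted.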

Thanks for your patience

Hello @rvinobha,

Thanks for your detailed reply. Unfortunately, I still can’t get it working.

Regarding 1:
I confirmed that the correct API key is in the NGC config file.

Regarding 2:
I ran both commands and can confirm that I immediately got Login Succeeded without entering the API key, as the correct one was already present:

server@AB-Server:/data/riva_quickstart_v2.3.0$ sudo usermod -aG docker server
server@AB-Server:/data/riva_quickstart_v2.3.0$ sudo docker login nvcr.io
Authenticating with existing credentials...
WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded

Running sudo docker pull nvcr.io/nvidia/tensorflow:18.06-py3 succeeded as well:

server@AB-Server:/data/riva_quickstart_v2.3.0$ sudo docker pull nvcr.io/nvidia/tensorflow:18.06-py3
18.06-py3: Pulling from nvidia/tensorflow
297061f60c36: Pull complete 
e9ccef17b516: Pull complete 
dbc33716854d: Pull complete 
8fe36b178d25: Pull complete 
686596545a94: Pull complete 
97becc2da853: Pull complete 
4691961fccc9: Pull complete 
8281fad90fcd: Pull complete 
e173493a7585: Pull complete 
4cc321c55676: Pull complete 
1282ed316170: Pull complete 
a1bcd0b40bfb: Pull complete 
ce951825002a: Pull complete 
3992697887ca: Pull complete 
a2580d50482e: Pull complete 
734ab5a9097a: Pull complete 
d805760875e3: Pull complete 
d4d863625491: Pull complete 
cb1ba57aaad1: Pull complete 
eabd681cb1d8: Pull complete 
ff7e9a510db5: Pull complete 
3ce05ba4ee2a: Pull complete 
7d3ca913a0cb: Pull complete 
feab07a5a6c4: Pull complete 
6edf251b3f8e: Pull complete 
68e095703e23: Pull complete 
093323c5e0eb: Pull complete 
0c841ffa800c: Pull complete 
7a0d06184de5: Pull complete 
ef362687c1a5: Pull complete 
d6bbf97e9e09: Pull complete 
534c66ee05c8: Pull complete 
e3f4a858e69c: Pull complete 
489e4b1c5775: Pull complete 
1df65788cbec: Pull complete 
5f8cdcbd8fe9: Pull complete 
ef7b68c398c6: Pull complete 
2a3259313c42: Pull complete 
08b3543110eb: Pull complete 
Digest: sha256:f6ae3be0464c8e4a0558343f9de1123d1962c37574ff674ebdbb967deabbdd32
Status: Downloaded newer image for nvcr.io/nvidia/tensorflow:18.06-py3
nvcr.io/nvidia/tensorflow:18.06-py3

Regarding 3:
I’ve exported the API key as requested and downloaded the new 2.3.0 version of Riva QuickStart, with the following output:

server@AB-Server:/data$ ngc registry resource download-version "nvidia/riva/riva_quickstart:2.3.0"
Downloaded 85.86 KB in 7s, Download speed: 12.25 KB/s               
----------------------------------------------------
Transfer id: riva_quickstart_v2.3.0 Download status: Completed.
Downloaded local path: /data/riva_quickstart_v2.3.0
Total files downloaded: 30 
Total downloaded size: 85.86 KB
Started at: 2022-07-13 11:29:12.087344
Completed at: 2022-07-13 11:29:19.101574
Duration taken: 7s
----------------------------------------------------

Then I ran riva_init.sh again, but unfortunately this led to the same error message:

server@AB-Server:/data/riva_quickstart_v2.3.0$ sudo bash riva_init.sh
Please enter API key for ngc.nvidia.com: 
Logging into NGC docker registry if necessary...
Pulling required docker images if necessary...
Note: This may take some time, depending on the speed of your Internet connection.
> Pulling Riva Speech Server images.
  > Pulling nvcr.io/nvidia/riva/riva-speech:2.3.0-server. This may take some time...

Downloading models (RMIRs) from NGC...
Note: this may take some time, depending on the speed of your Internet connection.
To skip this process and use existing RMIRs set the location and corresponding flag in config.sh.
Error response from daemon: No such container: riva-models-download
Error in downloading models.

As I applied the changes to the new config file of the 2.3.0 version manually again, please find here the new config.sh as a reference, just to be sure I didn’t mess something up there: config.sh (14.1 KB)

Before doing all this, I ran sudo apt update and sudo apt upgrade to ensure all packages are up to date.
All system updates succeeded without errors.

I’m not sure what could be special or wrong on this system. I really appreciate all your support.
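
One further check that fits at this point is to compare the architecture of the pulled image with that of the host, since Docker can pull a single-architecture image built for a different platform without complaint. A sketch under that assumption. The helper name arch_matches is hypothetical, and the mapping covers only the two common cases:

```shell
# Hypothetical helper: succeed if a `uname -m` value ($1) corresponds to a
# Docker image architecture ($2, as reported by `docker image inspect`).
arch_matches() {
  case "$1" in
    aarch64) _host="arm64" ;;
    x86_64)  _host="amd64" ;;
    *)       _host="$1" ;;   # assumption: other names are used unchanged
  esac
  [ "$_host" = "$2" ]
}

# Usage (requires the image to be present locally):
#   img=nvcr.io/nvidia/riva/riva-speech:2.3.0-server
#   arch_matches "$(uname -m)" \
#     "$(sudo docker image inspect --format '{{.Architecture}}' "$img")" \
#     || echo "architecture mismatch"
```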

Best regards
Andreas

Hi @ab-tools

Thanks for your interest in Riva.

My apologies, I missed this earlier. Just a quick check: in the logs shared above, we find the following command:

ngc registry resource download-version nvidia/riva/riva_quickstart:2.3.0

The above is meant for servers. Since the AGX Orin is an embedded device, we need to use

ngc registry resource download-version nvidia/riva/riva_quickstart_arm64:2.3.0

We need to follow the instructions for Embedded in the Riva docs.
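
The choice between the two resources can be tied to the machine architecture. A minimal sketch, using only the two resource names from this thread. The helper name riva_quickstart_resource is hypothetical, it assumes that "server" means x86_64, and other architectures are rejected rather than guessed:

```shell
# Hypothetical helper: print the quick start resource for an architecture
# ($1, as reported by `uname -m`) and a version ($2).
riva_quickstart_resource() {
  case "$1" in
    aarch64|arm64) echo "nvidia/riva/riva_quickstart_arm64:$2" ;;
    x86_64|amd64)  echo "nvidia/riva/riva_quickstart:$2" ;;  # assumption: server = x86_64
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Usage: print the download command for this machine instead of running it.
echo "ngc registry resource download-version \"$(riva_quickstart_resource "$(uname -m)" 2.3.0)\""
```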

Let us know if it works.

Thanks for your patience.

Hello @rvinobha,

You are absolutely right, thanks a lot!
Indeed, with the arm64 command everything now seems to work fine. :-)

Best regards and thanks again for your support
Andreas


Thanks for trying out the suggestion - and I am happy it worked out.
So I am marking this thread as solved and it will automatically close.
