Jetson Xavier NX DevKit and Riva 2.10.0

Hi,

I have a Jetson Xavier NX DevKit device with JetPack 5.1 installed. I have loaded the Riva 2.10.0 Quick Start and am trying to get the Question Answering model working. I have changed the config.sh file as follows:

  1. Set this as per the Quick Start:
    # Name of tegra platform that is being used. Supported tegra platforms: orin, xavier
    riva_tegra_platform="xavier"
  2. Not sure what to set this to (is this device tegra or non-tegra?):
    # GPU family of target platform. Supported values: tegra, non-tegra
    riva_target_gpu_family="non-tegra"
  3. Turning on just the NLP service, as per the Quick Start:
    # Enable or Disable Riva Services
    service_enabled_asr=false
    service_enabled_nlp=true
    service_enabled_tts=false
    service_enabled_nmt=false
When I set riva_target_gpu_family="non-tegra" and do a sudo bash riva_init.sh, the system does a lot of work converting the preconfigured models, but it eventually fails at what I believe is the last step of the process.

When I set riva_target_gpu_family="tegra" and do a sudo bash riva_init.sh, the system downloads the preconfigured models and the script terminates normally. However, when I do a sudo bash riva_start.sh, the container doesn't come up. Running sudo docker logs riva-speech shows that a process was terminated (I can't recall the exact message, but I can provide the complete log if desired).

I have successfully gotten Riva en-US ASR working on another NX DevKit, and Riva en_US TTS working on yet another NX DevKit (I have three), both with riva_target_gpu_family="tegra" in config.sh. What I can't seem to do is get more than one of these features working on the same NX, get NLP working by itself, or get ASR working in a language other than en_US, with or without this setting.

Hi @daniel.levine

Thanks for your interest in Riva

Apologies that you are facing issues.

JetPack 5.1 is the correct version for Riva 2.10.

I will check further with the team regarding this. Request to kindly provide:

  1. config.sh used
  2. complete log output of docker logs riva-speech as a text file in this thread

We also kindly request that you check whether the default runtime is set to nvidia on the Jetson platform by adding the "default-runtime": "nvidia" line to the /etc/docker/daemon.json file, if not already done. After editing the file, restart the Docker service using sudo systemctl restart docker.

Reference:
https://docs.nvidia.com/deeplearning/riva/user-guide/docs/quick-start-guide.html#embedded
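
For example, on a stock JetPack install the complete daemon.json typically ends up looking like the sketch below (please back up your existing file first if it contains any other customizations):

    # write daemon.json with the nvidia runtime as default, then restart Docker
    sudo tee /etc/docker/daemon.json > /dev/null <<'EOF'
    {
        "runtimes": {
            "nvidia": {
                "path": "nvidia-container-runtime",
                "runtimeArgs": []
            }
        },
        "default-runtime": "nvidia"
    }
    EOF
    sudo systemctl restart docker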

Once done, please verify by running docker run -it --rm --runtime nvidia ubuntu:20.04 and let us know if it works

Thanks

I did a riva_clean.sh (I kept the container images riva-speech:2.10.0-l4t-aarch64 and riva-speech:2.10.0-servicemaker-l4t-aarch64) and removed all the models to start clean. riva_init.sh finished successfully. Only one model was downloaded: models_nlp_intent_slot_misty_bert_base_v2.10.0-tegra-xavier. I must admit that I was expecting something like distilbert-base-uncased to get pulled down too for the Q&A capability.

My intent is just to have the Q&A demos working on my Jetson Xavier NX DevKit, so only the Riva NLP service was enabled. I have attached my config.sh:
config.sh (12.8 KB)

Anyway, I ran riva_start.sh; after trying and retrying for a while, it gives up on bringing the container up. The results of docker logs riva-speech are attached:
NLP.tegra.log (1.7 KB)

I have already made that change to /etc/docker/daemon.json as part of the Quick Start process. docker run -it --rm --runtime nvidia ubuntu:20.04 succeeds after downloading the Ubuntu image and gives me a root prompt in the container.

I should add that whenever I run the riva_*.sh scripts or docker commands, I need to use sudo or they don't succeed.

Note, I see this in the non-tegra path of config.sh:
### BERT Base Question Answering model fine-tuned on Squad v2.
"${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_question_answering_bert_base:${riva_ngc_model_version}"
I was hoping that there would be something like this in the tegra path of config.sh:
### BERT Base Question Answering model fine-tuned on Squad v2.
"${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_question_answering_distilbert_base:${riva_ngc_model_version}"

It would appear that the inability to connect to 127.0.0.1:8001 is not the reason it doesn't work (as far as I can tell), because my other Jetson Xavier NX DevKit, which runs only the ASR en_US supporting models, works just fine and spits out those messages to the log as well.

Hi @daniel.levine

Thanks for sharing the details,

I will check further with the internal team and provide an update

Thanks

I added the line below to the tegra NLP models to be downloaded, to see what happens, and turned on both the ASR and NLP services:

### BERT Base Question Answering model fine-tuned on Squad v2.
"${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_question_answering_bert_base:${riva_ngc_model_version}"
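
In context, the array in my config.sh now looks roughly like this (I'm assuming the array name models_nlp here and eliding the existing intent/slot entry, so treat this as a sketch rather than an exact copy of my file):

    models_nlp=(
        # ...existing tegra NLP entry (intent_slot_misty_bert_base) left as-is...
        ### BERT Base Question Answering model fine-tuned on Squad v2.
        "${riva_ngc_org}/${riva_ngc_team}/rmir_nlp_question_answering_bert_base:${riva_ngc_model_version}"
    )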

I ran riva_init.sh and all the expected models got downloaded from NVIDIA. It looked like it took the QA RMIR (which is not what the tegra path normally gets) in stride and converted it into the model_repository, and the script finished. I then ran riva_start.sh to see what happened. To my delight it ran for a while and then dropped me into the client testing container, which was a step forward. However, running client/riva_qa_nlp complained about the QA model; I can't recall the error. I'm now working to see if I can figure out how to turn another QA model file ending in .riva from the NVIDIA model library into an RMIR, put that in the model_repository by hand, and see if there's any difference.
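
For anyone following along, the by-hand route I'm attempting looks roughly like this (a sketch only: the mount paths and the <name>/<key> placeholders are mine, and the image tag is the servicemaker image I already have locally):

    # start the servicemaker container, mounting a working directory and the Riva model location
    sudo docker run -it --rm --runtime nvidia \
        -v /path/to/servicemaker-dev:/servicemaker-dev \
        -v /path/to/riva_model_loc:/data \
        --entrypoint="/bin/bash" \
        riva-speech:2.10.0-servicemaker-l4t-aarch64

    # inside the container: build an RMIR from the .riva file, then deploy it into the model repository
    riva-build qa /servicemaker-dev/<name>.rmir:<key> /servicemaker-dev/<name>.riva:<key>
    riva-deploy /servicemaker-dev/<name>.rmir:<key> /data/models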

I also tried downloading this: RIVA Question Answering | NVIDIA NGC

And followed this process to convert the encrypted .riva model to produce an RMIR file. But in Phase 2 my riva-build fails like this:

riva-build qa /servicemaker-dev/rmir-questionanswering_squad_english_bert:tao-encode /servicemaker-dev/questionanswering_squad_english_bert.riva:tao-encode

2023-04-10 19:58:28,508 [WARNING] Property 'encrypted' is deprecated. Please use 'encryption' instead.
2023-04-10 19:59:00,102 [INFO] Packing binaries for language_model/PyTorch : {'ckpt': ('nemo.collections.nlp.models.question_answering.qa_model.QAModel', 'model_weights.ckpt'), 'bert_config_file': ('nemo.collections.nlp.models.question_answering.qa_model.QAModel', 'bert-base-uncased_encoder_config.json')}
2023-04-10 19:59:00,103 [INFO] Copying ckpt:model_weights.ckpt -> language_model:language_model-model_weights.ckpt
Traceback (most recent call last):
  File "/usr/local/bin/riva-build", line 8, in <module>
    sys.exit(build())
  File "/usr/local/lib/python3.8/dist-packages/servicemaker/cli/build.py", line 102, in build
    rmir.write()
  File "/usr/local/lib/python3.8/dist-packages/servicemaker/rmir/rmir.py", line 159, in write
    newfact = art.create(name=target_name, content=fact.get_content(**cb), **props)
  File "<frozen eff.core.file>", line 307, in get_content
  File "<frozen eff.core.file>", line 269, in get_handle
  File "<frozen eff.core.file>", line 216, in decrypt
  File "<frozen eff.core.file>", line 236, in check_decryption
PermissionError: The provided passphrase is invalid
So, AFAIK I'm using the correct passphrase for the model, for both the input and the output (even though I probably don't need to do this for the output).

Haven’t gotten past this yet to move on to the rest of Phase 2 and Phase 3 of the procedure. :-/

I found that config.sh uses tlt_encode, so I used that in the above command instead of tao-encode and got further; I'm guessing the NVIDIA model documentation is out of date. :-/
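
For the record, the command that got further is the same as above with the key swapped:

    riva-build qa /servicemaker-dev/rmir-questionanswering_squad_english_bert:tlt_encode /servicemaker-dev/questionanswering_squad_english_bert.riva:tlt_encode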

So now riva-build finishes. Yay!

So now I did the next step:

riva-deploy /servicemaker-dev/rmir-questionanswering_squad_english_bert:tlt_encode /data/models
But that fails with:

2023-04-10 21:05:17,978 [INFO] Writing Riva model repository to '/data/models'...
2023-04-10 21:05:17,979 [INFO] The riva model repo target directory is /data/models
2023-04-10 21:05:49,808 [INFO] Using obey-precision pass with fp16 TRT
2023-04-10 21:05:49,809 [WARNING] /data/models/riva-trt-riva_qa-nn-bert-base-uncased already exists, skipping deployment. To force deployment rerun with -f or remove the /data/models/riva-trt-riva_qa-nn-bert-base-uncased
2023-04-10 21:05:49,810 [WARNING] /data/models/qa_tokenizer-en-US already exists, skipping deployment. To force deployment rerun with -f or remove the /data/models/qa_tokenizer-en-US
2023-04-10 21:05:49,810 [WARNING] /data/models/qa_qa_postprocessor already exists, skipping deployment. To force deployment rerun with -f or remove the /data/models/qa_qa_postprocessor
2023-04-10 21:05:49,811 [INFO] Extract_binaries for self -> /data/models/riva_qa/1
2023-04-10 21:05:49,812 [ERROR] Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/servicemaker/cli/deploy.py", line 100, in deploy_from_rmir
    generator.serialize_to_disk(
  File "/usr/local/lib/python3.8/dist-packages/servicemaker/triton/triton.py", line 447, in serialize_to_disk
    RivaConfigGenerator.serialize_to_disk(self, repo_dir, rmir, config_only, verbose, overwrite)
  File "/usr/local/lib/python3.8/dist-packages/servicemaker/triton/triton.py", line 314, in serialize_to_disk
    self.generate_config(version_dir, rmir)
  File "/usr/local/lib/python3.8/dist-packages/servicemaker/triton/nlp.py", line 404, in generate_config
    tokenizer._inputs[0].name: "IN_QUERY_STR__0",
AttributeError: 'RivaQATokenizer' object has no attribute '_inputs'
This may have been what the original rmir I downloaded failed with as well. It felt like the same kind of thing.

I’m going to try this, since the error seems similar, except their problem was with ASR and mine is with QA: https://forums.developer.nvidia.com/t/riva-speech-skills-initialisation-error-rivatokenizer-object-has-no-attribute/

After doing a riva_clean.sh, I ensured that NLP is the only capability set to true in config.sh and that the BERT QA RMIR is still added to the tegra path used by riva_init.sh. All docker run commands in the riva_*.sh scripts now have --privileged added to them.
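
To be concrete about that last edit, in each script the change is of this shape (everything other than the added flag is elided here, since the actual invocations carry many more options):

    # before
    docker run ... riva-speech:2.10.0-l4t-aarch64 ...
    # after
    docker run --privileged ... riva-speech:2.10.0-l4t-aarch64 ...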

The riva_init.sh finished silently, but didn't populate model_repository/models with anything. So I went back to where Phase 1, 2, and 3 of the by-hand method are described. All I needed to do was a riva_deploy, which resulted in:

    ModelOutput(name="SEQ__0", data_type=self.model.input_ids_type, dims=[max_seq_length]),
AttributeError: 'QuestionAnswering' object has no attribute 'input_ids_type'

Hey there, has anyone reproduced my issue?

Hi @daniel.levine

I have created an internal thread and provided all the information; once they provide inputs/updates I will reply back

Thanks

Thanks.

I have also tried the AMD64 (x86_64) version of Riva 2.10.0 to see how it behaves. I was able to get the en_US ASR and the TTS stuff to work just like on the Jetson Xavier NX. However, the BERT NLP QA model won't allow riva_start.sh to come up. Similarly, the en_es NMT ASR capability also prevents riva_start.sh from coming up.

However, when I enabled the NLP Megatron QA model in the script instead of the BERT QA one, it did come up and I was able to exercise the demo using it on the AMD64 platform. I tried to reproduce the result on the Jetson Xavier NX, but it doesn't look like it converted the RMIR for the model into anything, as far as I can tell. riva_start.sh doesn't come up, I believe, because there are no models.