No ASR text output after using riva-build for en-GB and then running riva-start

The Nvidia forum thinks there are links in my post, probably because the problem description includes IP addresses (or some other parsing error!). So I converted the post to a shared Google text file.

Please see the file: Nvidia_post_093022.txt - Google Drive

Here’s a link to the output of docker logs riva-speech.

Hi @edmond4

Apologies for the delay

This issue will be fixed in a future release.

For now, as a workaround, can you try the flag --nn.use_trt_fp32 in the riva-build command and let us know if it works?
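For example, added to your existing riva-build command, something like this (the paths, file names, and <key> below are just placeholders, not values from your setup; run inside the servicemaker container):

    riva-build speech_recognition \
        /servicemaker-dev/custom_asr.rmir:<key> \
        /servicemaker-dev/custom_asr.riva:<key> \
        --language_code=en-GB \
        --nn.use_trt_fp32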

Thanks for your patience

Hi, do I need something similar in riva-deploy, or only in riva-build?
I’m getting lots of warnings in riva-deploy:

[10/05/2022-19:00:45] [TRT] [W] parsers/onnx/onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[10/05/2022-19:00:45] [TRT] [W] parsers/onnx/onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[10/05/2022-19:00:45] [TRT] [W] parsers/onnx/onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[10/05/2022-19:00:45] [TRT] [W] parsers/onnx/onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped

With the last line repeated about 50 times or so…

I was able to run riva_build.sh and riva_deploy.sh
with the new --nn.use_trt_fp32 flag in riva-build
(see previous post for the full script).
However, I’m now having a problem with riva_start.sh, and I’m not sure why.
(The script is included here and has not changed from the previous post.)

Below is a link with the following:

  1. riva_start.sh
  2. directory listing of /data
  3. docker logs riva-speech

One question: where do I need to keep the .rmir file generated by riva-build?
To be safe, I made a copy in both /data and /data/models.

Here’s the link:

So, earlier in another thread you asked:

@petra1 I saw that also - I don’t know where to get that file or how to make it. I did notice that I do have that file for the en-US Conformer model (i.e., the out-of-the-box “oob” model).

Any idea whether I can download something from the NGC catalog and pass an option to riva-build to make the file, or whether I need to find that file somewhere in the NGC catalog? Any leads would be appreciated!

So that file (model.plan) should be created from the rmir file during the deploy. Could you clean up your models directory, make sure that your rmir is inside the rmir directory, and rerun riva_init.sh? The logs from the riva_init command could give some clues as to why the model.plan file is missing.
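In shell terms, roughly the following (a sketch assuming the quickstart layout with riva_model_loc set to /data; the .rmir file name is a placeholder):

    rm -rf /data/models/*              # clear the partially deployed model repository
    mv custom_asr.rmir /data/rmir/     # riva_init.sh looks for .rmir files here
    bash riva_init.sh                  # re-runs the deploy; watch its logs for errors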

Also, if you have logs from running your riva-build command, those would be handy too, to check whether the rmir build worked as expected.

In my previous experiments, I did succeed in creating a model with just a custom language model on top of the base acoustic model, similar to what you’re trying to do. One difference, though, is that I didn’t change --language_code to en-GB.

I run init only for the out-of-the-box setup (riva_quickstart), and that works fine for en-US.

Here I’m trying to change the language to en-GB, so I first run riva-build to make the rmir file from some existing LM files and .far files.
After riva-build, I run riva-deploy, which creates the models directory. My rmir file (created by riva-build) is not in any subdirectory.
Does the rmir file need to be in its own directory?
Also, I don’t run riva_init.sh or riva_start.sh from riva_quickstart; I run my own riva_start.sh (included earlier in the post).

If there’s a way I can show you what I’m doing, it might make more sense.

Thanks!

So, I just tried putting the rmir file in its own directory; riva_start.sh gives the same output.

Here’s my riva_build.sh script:

Does the rmir file need to be in its own directory?
When using riva_init.sh, the rmir directory is where the script looks for the rmir files. But you’re using riva-deploy directly (which I haven’t tried), and my guess is that a valid path to the rmir file should be enough.
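Something like this is what I’d expect to work, though I haven’t tested it myself (<key> and both paths are placeholders):

    riva-deploy /data/rmir/custom_asr.rmir:<key> /data/models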

Also, I don’t run riva_init.sh or riva_start.sh from riva_quickstart; I run my own riva_start.sh (included earlier in the post).
I see; is there any particular reason for not using the quickstart scripts?

I’m flexible and happy to try anything that works.
But the output of riva-build (which I need in order to use the en-GB language model) is only a .rmir file.

At that point I run a one-liner riva-deploy (inside the servicemaker container) plus a simplified riva-start (run on the host, as I have it).
I’m not sure if or how the quickstart riva_start.sh script should be modified (meaning, do I run it instead of riva-deploy or after riva-deploy?).
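For reference, the shape of my riva-deploy one-liner is roughly this (the image tag, paths, and <key> here are placeholders; my actual scripts are in the shared links above):

    docker run --rm --gpus all \
        -v /data:/data \
        nvcr.io/nvidia/riva/riva-speech:<version>-servicemaker \
        riva-deploy /data/rmir/custom_asr.rmir:<key> /data/models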

But the output of riva-build (which I need in order to use the en-GB language model) is only a .rmir file.
Do you mean that you didn’t see any logs? When I run the command, I can see the logs.

Also, what logs do you see when running riva-deploy (after erasing the existing models directory to restart the deploy fully)?

Hi,
Below is a link with the log of riva_build.sh (script shared previously)
and the log of riva_deploy.sh (script shared previously).
The same error occurs when I run riva_start.sh (script shared previously);
the output of docker logs riva-speech was also shared previously.

Let me know what you think. I’m happy to run this interactively with you…

https://drive.google.com/file/d/1aLxn6_zz3MB7RksUzbExg8V7zA-j0Ri3/view?usp=sharing

--nn.fp16_needs_obey_precision_pass \

is one flag that’s missing from your build command but that I’m using in mine (and that is used throughout the documentation).

We can schedule a chat to troubleshoot if this doesn’t help, or if the Nvidia moderators don’t provide a hint in the meantime.

Hi @edmond4

Apologies for the delay, and thanks for sharing the logs.

Could you please share the complete riva-build command?

Thanks so much, @petra1, for your kind input and help.

Thanks

Thanks @petra1 @rvinobha

There’s still a bug somewhere. I noticed the following:

  1. Riva deploy shows the following errors (I don’t know what, if anything, I need to rebuild for the Triton server to work properly, or how to check it):
    [10/18/2022-07:23:31] [TRT] [E] 4: [network.cpp::validate::2787] Error Code 4: Internal Error (fp16 precision has been set for a layer or layer output, but fp16 is not configured in the builder)
    [10/18/2022-07:23:31] [TRT] [E] 2: [builder.cpp::buildSerializedNetwork::609] Error Code 2: Internal Error (Assertion enginePtr != nullptr failed. )

  2. Riva deploy output also has several “One or more weights outside the range of INT32 was clamped” warnings.

  3. docker logs riva-speech output says:
    “Triton server died before reaching ready state. Terminating Riva startup.
    Check Triton logs with: docker logs” - This is consistent with the riva-deploy error.

My question is: how do I rebuild, restart, or check that the Triton server is running correctly so that riva-build/riva-deploy can work properly?
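The only generic checks I know of so far are these (riva-speech is the container name from my start script; the readiness endpoint is Triton’s standard HTTP API, assuming its default port 8000 is exposed from the container):

    docker logs -f riva-speech                 # follow the Riva/Triton startup logs
    curl -v localhost:8000/v2/health/ready     # should return HTTP 200 once Triton is ready

But since the Triton failure seems to be a consequence of the failed riva-deploy (the fp16 error above), I suspect the deploy has to be fixed first.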

FYI:
Below is a link to a shared document with the following:

  1. Riva build command
  2. Riva build log output
  3. Riva deploy command
  4. Riva deploy log output
  5. Riva start command
  6. Riva start log output
  7. docker logs riva-speech output

Thanks @petra1.

There’s still a bug; adding the flag didn’t change anything in riva-build.
I’m thinking the problem is with the Triton server (either not running or not built properly), based on the error notice from riva-deploy.

Take a look at the link in the recent post: https://forums.developer.nvidia.com/t/no-asr-text-output-after-building-riva-build-to-use-en-gb-and-the-running-riva-start/229473/16

Would definitely appreciate scheduling a chat to troubleshoot!

Thanks!
-edmond

Hi @edmond4

Thanks for sharing the requested details.

I will review everything and get back to you this week.

Thanks for your patience

I had a similar issue to the one described in this thread: I fine-tuned a custom model that returned empty transcripts when deployed, and the --nn.use_trt_fp32 flag fixed the issue. However, the fine-tuned model with fp32 transcribes about 50% slower than the base model (which I’m assuming was using fp16). Is this expected behavior? If so, do you have a timeline for when this issue will be fixed? Inference speed is critical for my application.

Hi @edmond4

Please share your fine-tuned en-GB model via email (I have sent you an email about this).

Thanks

Hi @david.kaleko

The internal team is actively working to solve the issue.
Regarding the timeline for the fix, I will check with my team and let you know.

Thanks