JARVIS throwing errors for offline ASR when using own model


I’m trying to transcribe a large number of audio files using my own ASR model.

I’ve trained a Jasper model using NeMo. The model works well when I test it with a simple program I wrote that loads the checkpoint I created using EncDecCTCModel.restore_from and then calls transcribe, but it’s slow (as expected).
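For reference, the batch-transcription loop I’m using looks roughly like the sketch below (a minimal sketch, assuming a NeMo 1.x-style EncDecCTCModel whose transcribe() accepts a list of file paths; the model path and file list are placeholders). Passing files in batches rather than one at a time amortizes the per-call overhead somewhat, but it’s still slow on CPU-bound pre/post-processing:

```python
def chunks(items, size):
    # Yield successive fixed-size batches from a list of file paths.
    for i in range(0, len(items), size):
        yield items[i:i + size]

# With the model loaded as in my test program, e.g.:
#   from nemo.collections.asr.models import EncDecCTCModel
#   model = EncDecCTCModel.restore_from("/data/models/moneypenny.nemo")
# transcripts can then be produced batch by batch:
#   for batch in chunks(wav_files, 32):
#       results.extend(model.transcribe(batch))

batches = list(chunks(["a.wav", "b.wav", "c.wav", "d.wav", "e.wav"], 2))
print(batches)
```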

My next step was to deploy the model to Jarvis for offline transcription, but I can’t get that working. I can get streaming ASR to work using the transcribe_file.py example app, but the results are worse than my test program’s. Since I don’t need streaming right now, I tried deploying the model in offline mode to see if the results were any better, by adding the --offline option to the jarvis-build command. This created a bunch of offline models, but when I try to run transcribe_file_offline.py I get the error “Error: Model is not available on server”. I can see there are offline models in my “models” directory, so I’m unsure how to fix this. Any help would be much appreciated.


TensorRT Version:
GPU Type: T4
Nvidia Driver Version: 450.119.03
CUDA Version: 11.0
CUDNN Version:
Operating System + Version: Ubuntu 18.04.5 LTS
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag): Docker nvcr.io/nvidia/jarvis/jarvis-speech:1.0.0-b.3-server

Relevant Files

The models are too large to attach, but the jarvis-build command created the following in my “models” directory.


Steps To Reproduce

Create the .enemo file from the NeMo model

python ~/NeMo/scripts/export/convasr_to_enemo.py --nemo_file=/data/models/moneypenny.nemo --onnx_file=/data/models/moneypenny.onnx --enemo_file=/data/models/moneypenny.enemo

Start docker

docker run --gpus all -it --rm -v /data/models:/servicemaker-dev -v /data/jarvis:/data --entrypoint="/bin/bash" nvcr.io/nvidia/jarvis/jarvis-speech:1.0.0-b.3-servicemaker

Build the Jarvis models from the .enemo file

jarvis-build speech_recognition /servicemaker-dev/moneypenny-offline.jmir /servicemaker-dev/moneypenny.enemo --offline
jarvis-build speech_recognition /servicemaker-dev/moneypenny-streaming.jmir /servicemaker-dev/moneypenny.enemo

Run Jarvis


Test the streaming ASR (This works)

python3 ./transcribe_file.py --audio-file ~/test.wav

Test the offline ASR (This fails)

python3 ./transcribe_file_offline.py --audio-file ~/test.wav

Error output

Traceback (most recent call last):
  File "./transcribe_file_offline.py", line 62, in <module>
    response = client.Recognize(request)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/grpc/_channel.py", line 946, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.INVALID_ARGUMENT
    details = "Error: Model is not available on server"
    debug_error_string = "{"created":"@1620636224.069156557","description":"Error received from peer ipv6:[::1]:50051","file":"src/core/lib/surface/call.cc","file_line":1067,"grpc_message":"Error: Model is not available on server","grpc_status":3}"
>
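Since I’m scripting over a large number of files, I also added a small helper to detect this failure programmatically rather than crashing: the debug_error_string in the traceback is a JSON object, so the status and message can be pulled out with the standard library (the sample string below is abbreviated from the traceback above):

```python
import json

def grpc_error_details(debug_error_string):
    # The debug_error_string attached to an _InactiveRpcError is JSON.
    info = json.loads(debug_error_string)
    return info.get("grpc_status"), info.get("grpc_message")

sample = ('{"created":"@1620636224.069156557",'
          '"description":"Error received from peer ipv6:[::1]:50051",'
          '"grpc_message":"Error: Model is not available on server",'
          '"grpc_status":3}')

status, message = grpc_error_details(sample)
print(status, message)  # 3 Error: Model is not available on server
```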

Hi @pete.hanlon,
Could you please share the output of the following command so we can help better:

docker logs jarvis-speech

Meanwhile, could you please try just deploying the offline model instead of both versions?


Hi Sunil,

Thanks for getting back to me so quickly. I tried deploying just the offline model and that fixed the problem! To be honest I thought I tried that already but clearly I hadn’t.

I don’t need the streaming model at the moment but I will in the future. Can the streaming and offline models coexist in a single JARVIS server, or would you expect them to be deployed on separate servers?

Hi @pete.hanlon,

They can coexist; the deployed models have different names for exactly that reason.
The server will choose the appropriate model depending on what the client requests, as long as the GPU has enough memory to load both models.
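One way to double-check which models actually loaded is to query the model repository index that the embedded Triton server exposes over HTTP (assuming port 8000 is reachable in your setup, e.g. curl -s -X POST localhost:8000/v2/repository/index). A small stdlib helper can then filter the response for models in the READY state; the sample JSON below is illustrative, not real server output, and the model names in it are placeholders:

```python
import json

def ready_models(index_json):
    # Keep only models the server reports as loaded and READY.
    entries = json.loads(index_json)
    return sorted(e["name"] for e in entries if e.get("state") == "READY")

# Illustrative sample of a repository-index response:
sample = json.dumps([
    {"name": "moneypenny-offline-ensemble", "state": "READY"},
    {"name": "moneypenny-streaming-ensemble", "state": "UNAVAILABLE"},
])

print(ready_models(sample))  # ['moneypenny-offline-ensemble']
```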