Hi, we are trying to deploy a Hindi Jarvis model for a live (streaming) speech-to-text service. We converted a NeMo Hindi model to a Jarvis model using the following steps:
AWS EC2: p2.xlarge
GPU: Nvidia Tesla T4
CUDA version: 11.2
Nvidia driver version: 460.80
Note: In the second step, we used the 1.0.0-b.3 ServiceMaker because we are using version 1.0.0-b.3 of the Jarvis API.
Using the above steps, we were able to convert the NeMo model to Jarvis models.
Conversion pipeline: NeMo → quartz.onnx → quartznet_asr.enemo → quartznet_asr.jmir → Jarvis models.
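To make the chain above easier to check, here is a minimal Python sketch that confirms each intermediate artifact exists and is non-empty. The file names are the ones listed in the pipeline; the helper name `first_missing_artifact` and the idea that all artifacts sit in one working directory are assumptions for illustration, not part of the Jarvis tooling.

```python
from pathlib import Path

# Intermediate artifacts named in the pipeline, in conversion order.
PIPELINE_ARTIFACTS = ["quartz.onnx", "quartznet_asr.enemo", "quartznet_asr.jmir"]


def first_missing_artifact(work_dir):
    """Return the first artifact that is missing or empty, or None if all are present.

    Hypothetical helper: assumes every artifact was written to `work_dir`.
    """
    for name in PIPELINE_ARTIFACTS:
        path = Path(work_dir) / name
        # A zero-byte file usually means the stage failed before writing output.
        if not path.is_file() or path.stat().st_size == 0:
            return name
    return None
```

Calling `first_missing_artifact("/path/to/work_dir")` (a placeholder path) returns the name of the first stage output that is absent or empty. This only checks that each stage produced a file; it does not check that the file's contents are correct.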
By setting the model location in the config file of Jarvis API 1.0.0-b.3, we were able to start the model by running ./jarvis_start.sh. However, when testing with a sample Hindi audio file, we get gibberish output.
NeMo model output and expected Jarvis model output:
Current output of the Jarvis Hindi model: