Hardware - GPU T4
Operating System: Ubuntu 20.04
Riva Version: 2.18.0
I’m trying to run inference with https://huggingface.co/nvidia/parakeet-ctc-0.6b through Riva. To export the model, I use the following nemo2riva command:
# nemo2riva==2.18.0
nemo2riva --out <path to save .riva model> --onnx-opset 18 <path to .nemo model>
After this, I use the following Riva pipeline configuration to build the Riva-specific files:
riva-build speech_recognition \
/path/to/my_model.rmir \
/path/to/my_model.riva \
--name=parakeet-0.6b-en-US-asr-streaming \
--return_separate_utterances=False \
--featurizer.use_utterance_norm_params=False \
--featurizer.precalc_norm_time_steps=0 \
--featurizer.precalc_norm_params=False \
--ms_per_timestep=80 \
--endpointing.residue_blanks_at_start=-16 \
--chunk_size=0.16 \
--left_padding_size=1.92 \
--right_padding_size=1.92 \
--decoder_type=greedy \
--greedy_decoder.asr_model_delay=-1 \
--language_code=en-US
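For what it’s worth, the timing flags above are self-consistent under the usual FastConformer assumptions (a 10 ms feature window stride and 8x encoder subsampling — my assumptions, not values read from the Parakeet config): 10 × 8 gives the 80 ms per encoder timestep passed to --ms_per_timestep, and the padding plus chunk flags imply roughly a 4 s audio window per streaming step. A minimal sanity check:

```python
# Hypothetical sanity check of the riva-build timing flags above.
# window_stride_ms and subsampling_factor are assumptions (typical
# FastConformer values), not values read from the model config.
window_stride_ms = 10        # feature-extractor hop, in ms (assumed)
subsampling_factor = 8       # FastConformer 8x subsampling (assumed)
ms_per_timestep = window_stride_ms * subsampling_factor
print(ms_per_timestep)       # should match --ms_per_timestep=80

chunk_size = 0.16            # seconds, from --chunk_size
left_pad = right_pad = 1.92  # seconds, from --*_padding_size
window_s = left_pad + chunk_size + right_pad
print(window_s)              # total audio window per streaming step, ~4.0 s
```

So the flags at least agree with each other; the empty transcripts described below don’t seem to come from a mismatched ms_per_timestep.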
But when I run inference with riva_streaming_asr_client.py against the built and deployed model, I always get an empty transcript.
Is there anything I’m missing? Any clues would be appreciated.
Thanks