Getting no transcript from Parakeet-CTC 0.6b model in Riva

Hardware: NVIDIA T4 GPU
Operating System: Ubuntu 20.04
Riva Version: 2.18.0

I’m trying to run inference with https://huggingface.co/nvidia/parakeet-ctc-0.6b using Riva. To do this, I convert the .nemo checkpoint with the following nemo2riva command:

# nemo2riva=2.18.0
nemo2riva --out <path to save .riva model> --onnx-opset 18 <path to .nemo model>
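
For reference, this is a minimal sketch of how I fetched the checkpoint beforehand (I’m assuming the Hugging Face repo ships the file as parakeet-ctc-0.6b.nemo; adjust the filename if it differs):

# Assumption: the checkpoint in nvidia/parakeet-ctc-0.6b is named parakeet-ctc-0.6b.nemo
huggingface-cli download nvidia/parakeet-ctc-0.6b parakeet-ctc-0.6b.nemo --local-dir ./models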

After this, I build the Riva deployment files with the following riva-build pipeline configuration:

riva-build speech_recognition \
  /path/to/my_model.rmir \
  /path/to/my_model.riva \
  --name=parakeet-0.6b-en-US-asr-streaming \
  --return_separate_utterances=False \
  --featurizer.use_utterance_norm_params=False \
  --featurizer.precalc_norm_time_steps=0 \
  --featurizer.precalc_norm_params=False \
  --ms_per_timestep=80 \
  --endpointing.residue_blanks_at_start=-16 \
  --chunk_size=0.16 \
  --left_padding_size=1.92 \
  --right_padding_size=1.92 \
  --decoder_type=greedy \
  --greedy_decoder.asr_model_delay=-1 \
  --language_code=en-US 
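
After riva-build, I deploy the generated RMIR into the Triton model repository with riva-deploy (the paths below are placeholders for my setup):

# Write the deployed model repository for the Riva server; -f overwrites an existing deployment
riva-deploy -f /path/to/my_model.rmir /data/models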

But when I run inference with riva_streaming_asr_client.py against the deployed model, I always get an empty transcript.
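
For completeness, this is roughly how I invoke the client (from the nvidia-riva python-clients repo; the flag names are from memory and may differ slightly between client versions):

# Stream a local WAV file to the Riva server and print the returned transcripts
python scripts/asr/riva_streaming_asr_client.py \
  --server localhost:50051 \
  --language-code en-US \
  --input-file /path/to/sample.wav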

Is there anything I’m missing? Any idea what could be going wrong?

Thanks

Can you please confirm that you ran both riva-build and riva-deploy?

Also, are you running inference on your own sample audio or on the audio files provided within the SDK?

Thanks

Hi, @AakankshaS

Thanks for your reply

Yes, I ran both riva-build and riva-deploy.

I tried both my own sample audio and the audio provided within the SDK.

It only works if I pass --nn.use_trt_fp32 to the riva-build command. Is that expected?

Is it possible to riva-build with --nn.fp16_needs_obey_precision_pass instead? Latency with --nn.use_trt_fp32 is roughly 2x higher.
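
Concretely, I would like to run the same riva-build command as above, only swapping the precision flag, along these lines (the remaining flags stay exactly as in my original command):

riva-build speech_recognition \
  /path/to/my_model.rmir \
  /path/to/my_model.riva \
  --name=parakeet-0.6b-en-US-asr-streaming \
  --nn.fp16_needs_obey_precision_pass \
  <remaining flags unchanged from the riva-build command above>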

Thanks