Please provide the following information when requesting support.
Hardware - GPU RTX 3090
Hardware - CPU AMD EPYC 7502
Operating System Ubuntu 22.04
Riva Version v2.14.0
How to reproduce the issue? (This is for errors. Please share the command and the detailed log here.)
Hello, my goal is to deploy a custom ASR model, trained on domain-specific data, with the help of Riva.
I can run inference with the built model, but the ASR results are much worse than those I obtain with plain NeMo inference scripts.
I noticed that quality is significantly affected by these arguments:
--chunk_size=4.8 \
--left_padding_size=0.0 \
--right_padding_size=0.0 \
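To make the effect of these three values concrete, here is a rough Python sketch of the arithmetic. It rests on an assumption about how chunking works (audio cut into back-to-back chunks of chunk_size seconds, each seeing left/right padding seconds of neighbouring audio as context); I have not verified this against Riva's implementation. The helper names, the 12-second clip length, and the 1.6-second padding are illustrative only; 4.8, 0.0 and 40 come from the arguments in this post.

```python
def to_frames(seconds, ms_per_timestep=40):
    """Convert a duration in seconds to encoder timesteps."""
    return round(seconds * 1000 / ms_per_timestep)


def chunk_windows(audio_s, chunk_s, left_s, right_s):
    """Return the (context_start, context_end) span in seconds for each chunk.

    Assumption: chunks are consecutive and non-overlapping, and padding only
    widens the audio the model sees around each chunk.
    """
    windows = []
    audio_ms = round(audio_s * 1000)
    chunk_ms = round(chunk_s * 1000)
    left_ms = round(left_s * 1000)
    right_ms = round(right_s * 1000)
    start_ms = 0
    while start_ms < audio_ms:
        end_ms = min(start_ms + chunk_ms, audio_ms)
        ctx_start = max(0, start_ms - left_ms)
        ctx_end = min(audio_ms, end_ms + right_ms)
        windows.append((ctx_start / 1000, ctx_end / 1000))
        start_ms = end_ms
    return windows


# Values from this post: chunk_size=4.8, ms_per_timestep=40, zero padding.
frames_per_chunk = to_frames(4.8)
no_padding = chunk_windows(12.0, 4.8, 0.0, 0.0)

# Illustrative non-zero padding for comparison (1.6 s is a made-up example).
with_padding = chunk_windows(12.0, 4.8, 1.6, 1.6)

print(frames_per_chunk)
print(no_padding)
print(with_padding)
```

Under that assumption, zero padding means every chunk is decoded with no audio context from its neighbours, whereas plain NeMo inference on a whole file sees the full utterance at once. That may be one reason the outputs differ.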
Do you have any suggestions on what the riva-build argument list should look like to get ASR outputs as close as possible to those I get in NeMo?
To start with, I would be happy to obtain greedy outputs equivalent to NeMo’s.
Here is the full list of my riva-build arguments.
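To measure how far the Riva greedy output is from the NeMo greedy output for a given argument set, a word error rate between the two transcripts is one option. Below is a minimal, dependency-free sketch; the function names and the two sample strings are made-up placeholders, not real model outputs, and the NeMo transcript is treated as the reference.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = cur
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate of hypothesis against reference (whitespace tokens)."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        return 0.0 if not hyp else float("inf")
    return edit_distance(ref, hyp) / len(ref)


# Placeholder transcripts, for illustration only.
nemo_text = "the patient reported mild chest pain"
riva_text = "the patient reported mild pain"

print(f"WER vs NeMo: {wer(nemo_text, riva_text):.3f}")
```

A value of 0.0 across a test set would mean the two greedy outputs match word for word, which is the target described above.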
riva-build speech_recognition \
/data/rmir/asr2.rmir \
/data/conformer-finetune-inhouse-more-data-golos-best.riva \
--offline \
--name=conformer-ru-RU-asr-offline \
--return_separate_utterances=False \
--featurizer.use_utterance_norm_params=False \
--featurizer.precalc_norm_time_steps=0 \
--featurizer.precalc_norm_params=False \
--ms_per_timestep=40 \
--endpointing.start_history=200 \
--nn.fp16_needs_obey_precision_pass \
--endpointing.residue_blanks_at_start=-2 \
--chunk_size=4.8 \
--left_padding_size=0.0 \
--right_padding_size=0.0 \
--max_batch_size=16 \
--featurizer.max_batch_size=512 \
--featurizer.max_execution_batch_size=512 \
--decoder_type=greedy \
--greedy_decoder.asr_model_delay=-1 \
--language_code=ru-RU
Thank you. Vic