Help Needed: Riva ASR Model Not Detecting Audio

thehuy22032000 · July 25, 2024, 10:47am

Hi everyone,

I’m deploying an ASR model with NVIDIA Riva and have run into an issue where the model does not detect any audio. I followed the steps below to build and deploy the model, but when I process an audio file, no audio seems to be recognized. Here are the steps and commands I used:

1. Building the Riva ASR model:

bash

riva-build speech_recognition \
  /data/rmir/Parakeet_ctc_xxl.rmir \
  /data/Parakeet-CTC-XXL-1.1b_spe13k_em-ea_1.0.riva \
  --offline \
  --name=parakeet-1.1b-unified-ml-cs-em-ea-asr-offline \
  --return_separate_utterances=True \
  --featurizer.use_utterance_norm_params=False \
  --featurizer.precalc_norm_time_steps=0 \
  --featurizer.precalc_norm_params=False \
  --ms_per_timestep=80 \
  --nn.fp16_needs_obey_precision_pass \
  --unified_acoustic_model \
  --chunk_size=4.8 \
  --left_padding_size=3.2 \
  --right_padding_size=3.2 \
  --featurizer.max_batch_size=256 \
  --featurizer.max_execution_batch_size=256 \
  --decoder_type=greedy \
  --greedy_decoder.asr_model_delay=-1 \
  --language_code=em-ea

2. Deploying the Riva model:

bash

riva-deploy -f \
  /data/rmir/Parakeet_ctc_xxl.rmir \
  /data/models/

Issue:

When calling the ASR service to process an audio file, I get the following log output, which indicates that no audio was detected:

I0725 09:14:10.128805 243 grpc_riva_asr.cc:678] ASRService.Recognize called.
I0725 09:14:10.132581 243 grpc_riva_asr.cc:863] Using model parakeet-1.1b-unified-ml-cs-em-ea-asr-offline-asr-bls-ensemble from Triton localhost:8001 for inference
I0725 09:14:10.233958 243 stats_builder.h:100] {"specversion":"1.0","type":"riva.asr.recognize.v1","source":"","subject":"","id":"04974b4b-76b2-4a7c-a941-ecd0498ce5d3","datacontenttype":"application/json","time":"2024-07-25T09:14:10.12875964+00:00","data":{"release_version":"2.16.0","customer_uuid":"","ngc_org":"","ngc_team":"","ngc_org_team":"","container_uuid":"","language_code":"em-ea","request_count":1,"audio_duration":0.0,"speech_duration":0.0,"status":0,"err_msg":""}}

It seems like the ASR service is being called, but it reports a duration of 0.0 seconds for both audio and speech. I’ve verified that the audio file is in the correct format and has content.
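
To watch these two fields across several requests, a small filter over the server log can help. This is only a sketch: stats_field is a hypothetical helper name (not a Riva tool), and it assumes the stats line keeps the JSON layout shown above, with numeric values written directly after the field name.

```shell
# Print one numeric field from the JSON payload of a Riva stats log line.
# stats_field is a hypothetical helper, not part of Riva.
# Usage: stats_field FIELD < logfile
stats_field() {
  # Keep only lines that contain "FIELD":<number> and print the number.
  sed -n "s/.*\"$1\":\([0-9][0-9.]*\).*/\1/p"
}

# Shortened copy of the stats line from this post (most fields omitted).
line='I0725 09:14:10.233958 243 stats_builder.h:100] {"data":{"release_version":"2.16.0","request_count":1,"audio_duration":0.0,"speech_duration":0.0,"status":0,"err_msg":""}}'

printf '%s\n' "$line" | stats_field audio_duration
printf '%s\n' "$line" | stats_field speech_duration
```

Both calls print 0.0 for the line above. Against a real log file the same function can be fed with something like `stats_field audio_duration < riva.log`, where riva.log is a placeholder for wherever the server output was saved.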

Questions:

Has anyone encountered a similar issue with Riva ASR models not detecting audio?
Are there any common misconfigurations or steps I might have missed in the model building or deployment process?
Could there be an issue with the audio file or the way it’s being processed?
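
For the last question, one way to double-check the file itself is to read the fields straight out of the WAV header. The sketch below makes several assumptions: the file uses the canonical 44-byte PCM header (fmt chunk right after RIFF, data chunk right after that), the host is little-endian, and wav_field / wav_info are hypothetical helper names. Files with extra chunks before the data chunk would need a real parser instead.

```shell
# Read an unsigned little-endian integer of SIZE bytes at OFFSET in FILE.
# Assumes a little-endian host, since od uses the host byte order.
wav_field() {  # wav_field FILE OFFSET SIZE
  od -An -tu"$3" -j"$2" -N"$3" "$1" | tr -d ' \n'
}

# Print the basic properties declared in a canonical 44-byte WAV header.
wav_info() {
  f="$1"
  channels=$(wav_field "$f" 22 2)    # number of channels
  rate=$(wav_field "$f" 24 4)        # sample rate in Hz
  byte_rate=$(wav_field "$f" 28 4)   # bytes of audio per second
  bits=$(wav_field "$f" 34 2)        # bits per sample
  data_bytes=$(wav_field "$f" 40 4)  # declared size of the data chunk
  # Declared duration = data bytes / bytes per second.
  duration=$(awk -v d="$data_bytes" -v r="$byte_rate" \
    'BEGIN { if (r > 0) printf "%.2f", d / r; else printf "unknown" }')
  echo "channels=$channels sample_rate=$rate bits=$bits data_bytes=$data_bytes duration_s=$duration"
}

# Usage (path is a placeholder):
#   wav_info /path/to/audio.wav
```

If data_bytes or duration_s comes back as 0 even though the file plays fine, the header may not declare the audio length, which would be worth ruling out before looking at the model build.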

Any guidance or suggestions would be greatly appreciated!

Thanks in advance for your help.

serhii-artemuk · April 22, 2025, 12:43pm

Try adding this flag to your riva-build command: --nn.use_trt_fp32
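
As a sketch of where that flag would go, here is the build command from the original post with --nn.use_trt_fp32 added next to the other --nn option, followed by the same deploy step. The run wrapper and the DRY_RUN variable are illustrative helpers, not part of Riva; by default the script only prints the commands. The thread does not say whether --nn.fp16_needs_obey_precision_pass should stay alongside the new flag, so keeping it here is an assumption.

```shell
# DRY_RUN defaults to 1: commands are printed instead of executed.
# Set DRY_RUN=0 inside the Riva container to actually run them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Paths, model name, and flags are copied from the original post;
# the only addition is --nn.use_trt_fp32.
rebuild_and_deploy() {
  run riva-build speech_recognition \
    /data/rmir/Parakeet_ctc_xxl.rmir \
    /data/Parakeet-CTC-XXL-1.1b_spe13k_em-ea_1.0.riva \
    --offline \
    --name=parakeet-1.1b-unified-ml-cs-em-ea-asr-offline \
    --return_separate_utterances=True \
    --featurizer.use_utterance_norm_params=False \
    --featurizer.precalc_norm_time_steps=0 \
    --featurizer.precalc_norm_params=False \
    --ms_per_timestep=80 \
    --nn.fp16_needs_obey_precision_pass \
    --nn.use_trt_fp32 \
    --unified_acoustic_model \
    --chunk_size=4.8 \
    --left_padding_size=3.2 \
    --right_padding_size=3.2 \
    --featurizer.max_batch_size=256 \
    --featurizer.max_execution_batch_size=256 \
    --decoder_type=greedy \
    --greedy_decoder.asr_model_delay=-1 \
    --language_code=em-ea

  # Redeploy so the server picks up the rebuilt RMIR.
  run riva-deploy -f \
    /data/rmir/Parakeet_ctc_xxl.rmir \
    /data/models/
}

rebuild_and_deploy
```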
