Riva ASR not returning the final transcript accurately

I am using the NVIDIA Jetson Orin Developer Kit and I want to use Riva ASR for English (en-US) speech-to-text. I referred to this document, Speech Recognition — NVIDIA Riva, to download riva_quickstart_arm64:2.19.0. Below is the default model configuration from config.sh, with which I ran riva_init.sh followed by riva_start.sh; both executed successfully.

```
riva_target_gpu_family="tegra"
riva_tegra_platform="orin"

service_enabled_asr=true
service_enabled_nlp=true

asr_acoustic_model=("conformer")
asr_language_code=("en-US")
asr_accessory_model=("")
```

After that, I downloaded and installed the Python client SDK from here: GitHub - nvidia-riva/python-clients: Riva Python client API and CLI utils. I am then running transcribe_mic.py from here: python-clients/scripts/asr at main · nvidia-riva/python-clients · GitHub. I do get an English transcript, but it is not accurate.
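For reference, this is roughly how I invoke the client. A minimal sketch: the flag names below (`--server`, `--language-code`, `--sample-rate-hz`) follow the python-clients repository at the time of writing, but check `python transcribe_mic.py --help` for the exact options in your version.

```shell
# Stream microphone audio to the local Riva server for en-US transcription.
# Assumes the Riva server started by riva_start.sh is listening on the
# default gRPC port 50051.
python scripts/asr/transcribe_mic.py \
    --server localhost:50051 \
    --language-code en-US \
    --sample-rate-hz 16000
```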

Below are my observations:

  1. While I speak, the intermediate (interim) transcript in the terminal shows the correct text, but the final transcript misses a few words in between.
  2. There is a flag called verbatim_transcripts in the config which, when set to true, is supposed to return the transcript as spoken, without inverse text normalization. It does not seem to work as expected: normalization is still applied, and words are still missing from the final output.
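To make observation 1 concrete, here is a small pure-Python sketch I use to compare the last interim transcript against the final transcript and list the dropped words. The example strings are illustrative, not real Riva output; in practice you would feed in the interim/final transcripts captured from the streaming responses.

```python
def missing_words(interim: str, final: str) -> list[str]:
    """Return words (case-insensitive) present in the interim transcript
    but absent from the final one, preserving interim order."""
    final_words = {w.lower() for w in final.split()}
    return [w for w in interim.split() if w.lower() not in final_words]

# Hypothetical transcripts illustrating the symptom: the final result
# drops words that the interim result had already recognized.
interim = "please schedule the meeting for tomorrow at three pm"
final = "please schedule the meeting tomorrow at pm"
print(missing_words(interim, final))  # ['for', 'three']
```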

Which settings do I need to configure or fine-tune so that the real-time transcript exactly matches what is spoken in English, without missing any words, as the conversation happens?

Hi @bikramjeet.nath ,
This is likely a known issue and is already on the roadmap to be fixed.
The fix will be available in an upcoming release. Please stay tuned to the release notes.
In the meantime, it would help us to understand how critical this request is for you.

Thanks