[BUG] Riva 1.8.0 punctuation pipeline

ilb · January 22, 2022, 11:54pm

Has the following known issue from the Riva 1.7.0 beta been resolved?

“The punctuation pipeline does not support unicode character input. This will be fixed in the next release.”

I am working with Riva 1.8.0 beta and NeMo:1.6.0rc0 on a non-English ASR+punctuation pipeline. I am able to convert and deploy both the ASR and the punctuation model. However, when I run examples/transcribe_file.py, the Riva/Triton log shows:

E0122 23:35:30.347671   326 grpc_riva_asr.cc:231] ITN not supported for language: __
W0122 23:35:30.347699   326 grpc_riva_asr.cc:241] Punctuation not supported for __ language

This happens regardless of whether I build the punctuation.rmir with --language_code __ or not.

If I register the ASR and punctuation models as en-US, however, portions of the resulting transcription are dropped whenever punctuation is enabled. Note that outside of Riva the punctuation model works as it should. The only notable difference from English is that the language uses some non-ASCII (UTF-8 encoded) characters.
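
To help pin down which portions are dropped, here is a minimal diagnostic sketch. It is not part of Riva or NeMo: it only uses the Python standard library to compare a transcript obtained with punctuation disabled against one obtained with punctuation enabled. The function names (dropped_tokens, non_ascii_tokens) and the sample strings are hypothetical and purely illustrative.

```python
import difflib
import string


def dropped_tokens(raw, punctuated):
    """Return the tokens of `raw` that are missing or altered in `punctuated`."""
    def normalize(text):
        # Lowercase and strip ASCII punctuation so that capitalization and
        # inserted punctuation marks are not reported as differences.
        table = str.maketrans("", "", string.punctuation)
        return text.lower().translate(table).split()

    raw_tokens = normalize(raw)
    punct_tokens = normalize(punctuated)
    matcher = difflib.SequenceMatcher(a=raw_tokens, b=punct_tokens, autojunk=False)

    missing = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # "delete": tokens only present in the raw transcript.
        # "replace": tokens that came back changed (for example, truncated).
        if tag in ("delete", "replace"):
            missing.extend(raw_tokens[i1:i2])
    return missing


def non_ascii_tokens(tokens):
    """Keep only the tokens that contain at least one non-ASCII character."""
    return [token for token in tokens if not token.isascii()]


if __name__ == "__main__":
    # Made-up sample strings; substitute the two transcripts returned by the server.
    raw = "dobrý den jak se máš"
    punctuated = "Dobrý den, jak se."
    lost = dropped_tokens(raw, punctuated)
    print("dropped:", lost)
    print("dropped and non-ASCII:", non_ascii_tokens(lost))
```

If most of the dropped tokens also show up in the non-ASCII list, that would be consistent with the unicode limitation quoted above, although it would not prove that this is the cause.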

ilb · January 23, 2022, 3:25pm

In the official Riva 1.8.0 documentation, under the section “Inverse Text Normalization”, one can read: “Currently, the grammars are limited to English. In a future release, additional information on training, tuning, and loading custom grammars will be available.” Is there a roadmap for when this additional information will be available? We would love to develop the ITN for our language of choice, but we would need more information on when and how this will be supported in Riva. Even a heads-up that it is or will be based on NeMo/text_processing would be a good first step.
