Has the following known issue from the Riva 1.7.0 beta been resolved?
- The punctuation pipeline does not support unicode character input. This will be fixed in the next release.
I am working with Riva 1.8.0 beta and NeMo:1.6.0rc0 on a non-English ASR+punctuation pipeline. I am able to convert and deploy both the ASR and the punctuation model. However, when I run examples/transcribe_file.py, the Riva/Triton log shows:
E0122 23:35:30.347671 326 grpc_riva_asr.cc:231] ITN not supported for language: __
W0122 23:35:30.347699 326 grpc_riva_asr.cc:241] Punctuation not supported for __ language
This happens regardless of whether I build punctuation.rmir with --language_code __ or not.
If I instead register the ASR and punctuation models as en-US, portions of the resulting transcription are dropped whenever punctuation is enabled. Note that the punctuation model works as expected outside of Riva. The only notable difference I can see is that the language uses some non-ASCII (UTF-8 encoded) characters.
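To check whether the dropped portions line up with the non-ASCII characters, a rough comparison like the one below might help. It is only a sketch and does not call Riva at all: it assumes you already have two plain strings, one transcript produced with punctuation disabled and one with punctuation enabled. The function names are hypothetical helpers, not part of Riva or NeMo.

```python
import difflib

def normalize(text):
    """Lowercase and strip common punctuation so only the words are compared."""
    words = (w.strip(".,?!;:").lower() for w in text.split())
    return [w for w in words if w]

def non_ascii_chars(text):
    """Return the sorted set of characters outside the ASCII range."""
    return sorted({ch for ch in text if ord(ch) > 127})

def dropped_words(without_punct, with_punct):
    """Words in the unpunctuated transcript that are missing or changed
    in the punctuated one, in their original order."""
    ref = normalize(without_punct)
    out = normalize(with_punct)
    matcher = difflib.SequenceMatcher(a=ref, b=out, autojunk=False)
    dropped = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            dropped.extend(ref[i1:i2])
    return dropped

def dropped_with_non_ascii(without_punct, with_punct):
    """Subset of the dropped words that contain at least one non-ASCII character."""
    return [w for w in dropped_words(without_punct, with_punct)
            if non_ascii_chars(w)]
```

Calling dropped_words(plain, punctuated) on the two transcripts lists what went missing, and dropped_with_non_ascii narrows that to words containing non-ASCII characters. If nearly all dropped words fall into the second list, that would point towards the unicode limitation quoted above rather than a problem with the model itself.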