Hardware: GPU RTX 4090
Operating System: Linux 22.04.5
Riva Version: 2.18.0
I wanted to know if it would be possible to force the model to work exclusively in a single language. I’m currently facing an issue where, during streaming transcription for Brazilian Portuguese (pt-BR), the model mixes Russian, English, and Brazilian Portuguese. The model’s performance has been terrible.
Transcription example:
“Посторечо пицца, porfvoр бризл tá começando a falar russo, e não volta”
Unfortunately, this is not supported yet, but we are working on it. It will be available in a future release.
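In the meantime, a client-side check can at least flag transcripts that drift into another script, such as the Cyrillic words in the example above. The following is only a minimal sketch using the Python standard library. It is not a Riva feature, and the helper names `script_of` and `unexpected_script_words` are hypothetical. It also cannot catch language mixing within the same script (for example English words in a pt-BR transcript).

```python
import unicodedata


def script_of(ch: str) -> str:
    """Return a coarse script label for one character (assumption: the Unicode name prefix is good enough)."""
    if not ch.isalpha():
        return "OTHER"
    name = unicodedata.name(ch, "")
    if name.startswith("CYRILLIC"):
        return "CYRILLIC"
    if name.startswith("LATIN"):
        return "LATIN"
    return "OTHER"


def unexpected_script_words(transcript: str, expected: str = "LATIN") -> list[str]:
    """Return the whitespace-separated words that contain a letter outside the expected script."""
    flagged = []
    for word in transcript.split():
        scripts = {script_of(ch) for ch in word if ch.isalpha()}
        if scripts - {expected}:
            flagged.append(word)
    return flagged


if __name__ == "__main__":
    # Hypothetical usage: inspect a final transcript returned by the ASR service.
    sample = "tá começando a falar russo, привет e não volta"
    print(unexpected_script_words(sample))
```

A caller could use a non-empty result to drop or re-request the affected segment; what to do with flagged output depends on the application.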
@fernandovidal8878901 Can you share which of the following use cases are relevant?
- batch processing / streaming
- Audio is in multiple languages → output transcription in a language that is different from the audio languages.
- Audio is in a single language → output transcription in the language that was spoken. Using a multilingual model to be ready for multiple languages in different sessions.
- Audio is in multiple languages → transcribe each language as spoken (no translation).
- Is there a need for LID (language identification) metadata?
- Languages known in advance (forced) vs. auto-detected audio languages.
- Other use-cases of interest…
Thank you for the feedback! I’m excited about it.
For me, the following use cases are relevant:
- batch processing / streaming
- Audio is in a single language → output transcription in the language that was spoken. Using a multilingual model to be ready for multiple languages in different sessions.
- Languages known in advance (forced) vs. auto-detected audio languages.
Furthermore, I am very interested in transcription models for Brazilian Portuguese, mainly for real-time use.