Description
My laptop is fairly new: RTX 4080, Ubuntu 22.04, GPU driver 525.147.05, and CUDA 12.0.
Yesterday I installed riva_quickstart_v2.13.1 (the latest). Both riva_init.sh and riva_start.sh work very well. In config.sh I enabled ASR and TTS, with language_code set only to "zh-CN" (Mandarin).
-
As the client I have been using GitHub - nvidia-riva/python-clients: Riva Python client API and CLI utils, which has the scripts under scripts/asr and scripts/tts.
-
For the ASR test, Riva did a super job: it recognized Mandarin very accurately and very fast!
-
For TTS, however, the audio file is fine only when the input is a single Chinese sentence. When the input combines multiple sentences (say, three), the first sentence sounds correct, but everything after it sounds like a local Chinese dialect rather than standard Mandarin. I am not sure what is wrong. Could you please help me solve this problem with Mandarin TTS?
I am trying to attach python-clients.tar, where I used scripts/tts/talk.py or talk_input.py. Example commands:
A. python3 talk.py --voice=Mandarin-CN.Female-1 --play-audio --server localhost:50050 --text "现在是上海交通大学计算机系大四在校学生"
This works very well for a single input sentence.
B. python3 talk_input.py -o output.wav --voice=Mandarin-CN.Female-1 --play-audio --server localhost:50050
Here I used the input file in.txt, which contains three sentences. But in the generated multi_sentence.wav, only the first sentence is standard Mandarin.
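To narrow this down, one thing I could try is sending each sentence to TTS as its own request and joining the audio afterwards, since single sentences come out fine. Below is a minimal sketch of that idea, not a confirmed fix. The names `split_sentences`, `synthesize_in_chunks`, and `synthesize_one` are hypothetical and are not part of python-clients; `synthesize_one` stands for whatever call actually sends one sentence to the Riva server and returns raw audio bytes.

```python
import re
from typing import Callable, List

# One or more non-terminator characters followed by any run of
# Chinese or Western sentence-ending punctuation.
_SENTENCE = re.compile(r"[^。！？!?]+[。！？!?]*")


def split_sentences(text: str) -> List[str]:
    """Split text into sentences, keeping the ending punctuation."""
    chunks = _SENTENCE.findall(text)
    # Drop whitespace-only chunks, e.g. newlines between sentences.
    return [c.strip() for c in chunks if c.strip()]


def synthesize_in_chunks(text: str,
                         synthesize_one: Callable[[str], bytes]) -> bytes:
    """Synthesize each sentence separately and concatenate the audio.

    synthesize_one is a hypothetical callback (an assumption of this
    sketch): it takes one sentence and returns raw PCM bytes. All
    chunks must share the same sample rate and encoding for simple
    concatenation to be valid.
    """
    return b"".join(synthesize_one(s) for s in split_sentences(text))


if __name__ == "__main__":
    sample = "你好。今天天气很好！我们走吧"
    for sentence in split_sentences(sample):
        print(sentence)
```

If the per-sentence audio is all standard Mandarin, that would suggest the problem is in how the server handles long multi-sentence input rather than in the voice model itself.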
I tried to attach python-clients.tar and config.sh, but the Upload icon in this panel does not let me select those two files. If you can show me how to upload them, that would be great!
Thanks so much for your help
James