TTS on Jarvis generates long strange sounds after ending the sentence

time.not.wait.me · May 21, 2021, 10:04am

Hi, I trained Tacotron2 on Thai using NeMo and deployed it to Jarvis. The output from NeMo is fine, but Jarvis generates long strange sounds after the end of the sentence.

Example

Input sentence : “ทำ ไร กัน อยู่ กิน ข้าว กิน ปลา รึ ยัง”
Pronunciation in English: “tam rai gun yu kin khaew kin pla rue yang”
The audio for this sentence should end after about two seconds.

I used Jarvis version 1.1 beta.

Another question: does Jarvis support other TTS models, such as FastSpeech or FastPitch? The documentation shows only a Tacotron2 example.

SunilJB · May 28, 2021, 5:02am

Hi @time.not.wait.me ,
Could you please share the NeMo model, script, and log files so we can help better?

Thanks

time.not.wait.me · June 1, 2021, 8:03am

jarvis-service-maker logs:
jarvis_service_maker.txt (5.4 KB)

jarvis-server logs:
jarvis_server.log (82.5 KB)

time.not.wait.me · June 1, 2021, 8:10am

nemo model

test script
jarvis_tts_TEST.ipynb (154.9 KB)

SunilJB · June 10, 2021, 3:24pm

Hi @time.not.wait.me
This is a known bug caused by Tacotron2 not having an explicit duration model. The model has to “decide” when to stop generating, and sometimes that never happens, so it keeps producing those strange sounds after it has finished the input text.
Since we cannot predict in advance how long the sentence will be, this can happen, especially with models that were not trained long enough or were trained on small datasets.
Explicit duration model support will be added in a future release.
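
To illustrate the stopping behaviour described above, here is a minimal toy sketch of an autoregressive decode loop with a stop gate. It is not Jarvis or NeMo code: `decode`, `stop_logit_at`, `gate_threshold`, and `max_decoder_steps` are hypothetical names chosen for illustration, and the logits are made up. It only shows why a model whose stop gate never fires keeps emitting frames until some hard step limit cuts it off.

```python
import math


def sigmoid(x):
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode(stop_logit_at, gate_threshold=0.5, max_decoder_steps=1000):
    """Toy decoder loop (illustrative only, not the real Tacotron2 decoder).

    stop_logit_at: hypothetical callable returning the stop-gate logit
        the model would predict at a given step.
    Returns (frames_emitted, stopped_by_gate).
    """
    frames = 0
    for step in range(max_decoder_steps):
        frames += 1  # one spectrogram frame is emitted per decoder step
        if sigmoid(stop_logit_at(step)) > gate_threshold:
            return frames, True  # the model "decided" to stop
    # The gate never fired: only the hard step limit ended generation.
    return frames, False


# Assumed behaviour of a well-trained model: the gate fires at step 160.
good = decode(lambda step: 5.0 if step >= 160 else -5.0)

# Assumed behaviour of an under-trained model: the gate never fires.
bad = decode(lambda step: -5.0)

print(good)
print(bad)
```

In the second case every frame after the real end of the sentence is audio the model was never supposed to produce, which matches the symptom reported in this thread. A model with an explicit duration predictor knows the number of frames up front and does not depend on such a gate.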

Thanks
