Audio2Face with Riva TTS Extension

Hi.
By following the demo video, I was able to set up the Riva service and make the avatar talk based on text typed into the “Audio2Face Riva TTS Extension” in Audio2Face.

My question is: once the extension is correctly set up in Audio2Face (localhost:50051), will all TTS requests that I send to the Riva endpoint (localhost:50052) in online streaming mode be routed directly to the A2F endpoint, with no audio data coming back? In other words, will the variable resp in the following code stay empty?

import numpy as np
import riva.client

RIVA_SPEECH_URI = "localhost:50052"  # Riva endpoint used in this setup

auth = riva.client.Auth(uri=RIVA_SPEECH_URI)
tts_service = riva.client.SpeechSynthesisService(auth)
sample_rate_hz = 44100
req = {
    "language_code": "en-US",
    "encoding": riva.client.AudioEncoding.LINEAR_PCM,  # LINEAR_PCM and OGGOPUS encodings are supported
    "sample_rate_hz": sample_rate_hz,                  # Generate 44.1 kHz audio
    "voice_name": "English-US.Male-1",                 # The name of the voice to generate
}
req["text"] = "Hi, how are you?"
resp = tts_service.synthesize_online(**req)  # iterator of streamed responses
audio = np.array([])

# Convert each 16-bit PCM chunk to floats in [-1, 1) and append it
for rep in resp:
    audio_samples = np.frombuffer(rep.audio, dtype=np.int16) / (2**15)
    audio = np.concatenate((audio, audio_samples))

Here’s the response from our engineering team. Please let us know if you need further assistance.

  • With the current riva_tts extension, yes: it passes the received audio directly to the A2F streaming player.
  • To change the URLs, users can call the extension’s methods set_riva_url(url) and set_a2f_player_url(url).
  • To get the Riva audio for debugging, either (a) change the URL to point at your own local audio bridge service, or (b) change tts_client to save the audio in addition to sending it to the A2F player.
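
For option (b), here is a minimal sketch of the "save the audio" part. It is not the extension's own tts_client code. The helper name save_tts_chunks is hypothetical, and it assumes each streamed response exposes raw 16-bit mono LINEAR_PCM bytes in an .audio attribute, as the responses in the snippet from the question do.

```python
import wave


def save_tts_chunks(responses, path, sample_rate_hz=44100):
    """Write streamed PCM chunks to a WAV file and return the number of bytes written.

    Hypothetical helper: assumes each item in `responses` has an `.audio`
    attribute holding raw 16-bit mono LINEAR_PCM bytes.
    """
    total_bytes = 0
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)             # assumption: mono audio
        wav.setsampwidth(2)             # 16-bit samples (2 bytes each)
        wav.setframerate(sample_rate_hz)
        for rep in responses:
            wav.writeframes(rep.audio)  # append this chunk to the file
            total_bytes += len(rep.audio)
    return total_bytes
```

With the request from the question, this could be called as save_tts_chunks(tts_service.synthesize_online(**req), "riva_debug.wav", sample_rate_hz). A return value of 0 would mean no audio came back on that stream.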

Hi @Ehsan.HM
Thanks for the reply. I’m getting closer to it.

May I know how to call the extension’s methods? Is there any documentation or sample code that I could refer to?

By using streaming mode as follows,
image

here is the setup:

I noticed that /World/audio2face/CoreFullface is not the default one to be used for A2F streaming, but /World/audio2face/CoreFullface_01 is. Is that an issue?

When AudioPlayer is playing audio, the face is not moving. Is that an issue?

I can trigger A2F streaming successfully by typing text in the extension, but when I use the following code to invoke the TTS service, A2F streaming does not react.
resp = tts_service.synthesize_online(**req)

I’m very sure that TTS server receives the request because I checked the log on the server.

Do you have any idea how to address this problem?

Generally, there should be only one CoreFullFace node. Having two complicates things and might be the reason the face is not moving: the wrong one might be driving the face.

Here’s how you can call extension methods:

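import omni.audio2face.riva_tts

ext = omni.audio2face.riva_tts.get_ext()
# url is a placeholder; pass the address appropriate to each call
ext.set_riva_url(url)        # point the extension at the Riva server
ext.set_a2f_player_url(url)  # and/or point it at the A2F streaming player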

Thanks @Ehsan.HM

I reinstalled A2F and set it up again. It’s getting better now: there is only one CoreFullFace. This time, I went directly to streaming mode.
image

However, after setting the URLs in the Riva_TTS extension, I still cannot trigger A2F by invoking the Riva TTS API with resp = tts_service.synthesize_online(**req), although I can trigger A2F by typing text in the extension.
The Riva log shows that the TTS API is invoked successfully, so I’m not sure why A2F is not triggered.

Question 1: Is there anything I missed that is needed to trigger A2F? My understanding is that A2F should be triggered once the URLs are set up correctly in the extension and the Riva TTS API is invoked.

Question 2: I got an error when running the following code. Is there any library that I should pip install first?

import omni.audio2face.riva_tts
ext = omni.audio2face.riva_tts.get_ext()
ext.set_riva_url(url) or ext.set_a2f_player_url(url)

Question 3: Could I use this library, omni.audio2face.riva_tts, to

  • set the desired text to be sent, and
  • send it?

The code above only shows how to set URLs in the extension.