A2F to Unity streaming

Hey all,

I am trying to build a pipeline that feeds Audio2Face data directly into Unity.
I’m talking about an animated avatar that receives a speech audio file and talks to you live.
I have already set up a facial test in A2F, which works great; I’m just unsure how the pipeline works for streaming to Unity.
Any help will be greatly appreciated.
Cheers, Kurt

Hello @soontekk! I recommend that you join our Discord community at discord.gg/nvidiaomniverse. You can chat with other community members about the best way to set up your A2F stream! In the meantime, I will forward this over to the dev team for some help!

We also have a tutorial video that may help here:

Hi Wendy,
I’ll join the Discord for sure and check the docs too.
Thank you so much.

Currently, this is not possible. Thanks for bringing this up. We will discuss this internally.

Thank you Ehsan, and sorry for the late reply. I thought I had replied, but it seems not ;)

Just had a meeting with the team; we are still researching the Unity and Unreal routes.
Our dev found that it is not possible to feed audio into the A2F streaming to drive our avatars.

His words:

After looking into NVIDIA’s Audio2Face, we discovered that it is presently not compatible with streaming into Unity. The missing component needed for streaming is the blendpose value keys, which the API is unable to retrieve. Nevertheless, we found a workaround by manually adding code to the facsSolver.py file to send the blendpose keys into Unity. This solution only works when audio is played through Audio2Face, a feature that is unfortunately not supported by the API.

Is there any way we can access this information in a way that can be automated?

Thanks @soontekk

There are 2 steps to resolve this:

  • Audio2Face needs to stream the weights
  • Code inside Unity needs to receive the stream, similar to what exists for Unreal Engine.
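
To make the two steps above concrete, here is a minimal sketch of what the sending side could look like, assuming the weights are packed as JSON and sent as UDP datagrams. Nothing here is an Audio2Face or Unity API: the wire format, the port handling, the function names (`encode_frame`, `decode_frame`, `send_frame`, `loopback_demo`) and the blendshape names are all hypothetical, chosen only for illustration. The Python receiver stands in for the Unity side, which would need equivalent decoding logic in C#.

```python
import json
import socket

# Hypothetical throughout: neither Audio2Face nor Unity defines this wire
# format, these helper names, or these blendshape names.
HOST = "127.0.0.1"


def encode_frame(frame_index, names, weights):
    """Pack one frame of blendshape weights into a UTF-8 JSON datagram."""
    if len(names) != len(weights):
        raise ValueError("names and weights must have the same length")
    payload = {
        "frame": frame_index,
        "weights": {name: float(w) for name, w in zip(names, weights)},
    }
    return json.dumps(payload).encode("utf-8")


def decode_frame(datagram):
    """Inverse of encode_frame; a Unity receiver would mirror this in C#."""
    payload = json.loads(datagram.decode("utf-8"))
    return payload["frame"], payload["weights"]


def send_frame(sock, address, frame_index, names, weights):
    """Send one encoded frame to the receiver at `address` over UDP."""
    sock.sendto(encode_frame(frame_index, names, weights), address)


def loopback_demo():
    """Send a single frame to a local stand-in receiver and decode it."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind((HOST, 0))  # port 0: let the OS pick a free port
    receiver.settimeout(2.0)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        names = ["jawOpen", "mouthSmileLeft"]  # illustrative names only
        send_frame(sender, receiver.getsockname(), 0, names, [0.5, 0.25])
        datagram, _ = receiver.recvfrom(65535)
        return decode_frame(datagram)
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(loopback_demo())
```

UDP is used in this sketch because a dropped animation frame is usually preferable to a delayed one; if every frame must arrive in order, a TCP socket would be the more natural choice.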

We’ll discuss adding a Play/Stop API internally.

Brilliant, thank you @Ehsan.HM

This functionality is the holy grail for metaverse/gaming: totally reactive NPCs, or leaving your own avatar on in the game 24x7 because it has been trained on your prior experience and transfer-learned from an LLM. Audio2Face becomes the most powerful tool in the world when this problem is solved and made public.

There are, of course, the NVIDIA ACE and Tokkio programs, but I have not had a response from NVIDIA about accessing them.

Is there any ready-made solution for live streaming with Audio2Face on a local machine, other than NVIDIA Omniverse Avatar Cloud Engine (ACE)?

I’m not entirely sure what you mean, but if you’d like to generate animation from live-streamed audio on your machine, you can follow this tutorial:

Overview of Streaming Audio Player in Omniverse Audio2Face - YouTube

Thanks for your reply.

I have some questions about the link you provided (repeated below).

In this video, is the A2F model accelerated by TensorRT?
If so, how much faster does TensorRT make it?

Overview of Streaming Audio Player in Omniverse Audio2Face - YouTube

Yes, the solve is built using TensorRT, but since we lack a CPU implementation, we can’t precisely measure how much TensorRT accelerates the computation. That said, you can see the time in the UI.
