I am trying to build a pipeline that feeds Audio2Face data directly into Unity.
I mean an animated avatar that receives a speech audio file and talks to you live.
I have already set up a facial test in A2F, which works great; I’m just unsure how the pipeline towards Unity streaming works.
Any help will be greatly appreciated.
Cheers, Kurt
Hello @soontekk! I recommend joining our Discord community at discord.gg/nvidiaomniverse, where you can chat with other community members about the best way to set up your A2F stream. In the meantime, I will forward this over to the dev team for some help!
Thank you Ehsan, and sorry for the late reply. I thought I had replied, but it seems not ;)
We just had a meeting where the team is still researching the Unity and Unreal routes.
Our dev found that it is currently not possible to feed audio into A2F streaming to drive our avatars.
His words:
After looking into NVIDIA’s Audio2Face, we discovered that it is presently not compatible with streaming into Unity. The missing component needed for streaming is the blendshape weight values, which the API is unable to retrieve. Nevertheless, we found a workaround: manually adding code to the facsSolver.py file to send the blendshape weights into Unity. This solution only works when audio is played through Audio2Face itself, which is unfortunately something the API does not support.
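For anyone curious what that workaround looks like in practice, here is a minimal sketch of the kind of code our dev added. This is an assumption-heavy illustration, not official Audio2Face API: the host/port constants, the `send_weights` helper, and the JSON message format are all made up for this example; the only source fact is that solved blendshape weights get pushed from the Python solver to a listener on the Unity side (UDP is used here just because it is a simple fire-and-forget transport for per-frame data).

```python
# Hypothetical sketch: push solved blendshape weights from a modified
# facsSolver.py to a Unity listener over UDP as JSON.
# UNITY_HOST, UNITY_PORT, and the {"bs": {...}} payload shape are
# assumptions for this example, not part of the Audio2Face API.
import json
import socket

UNITY_HOST = "127.0.0.1"   # machine running the Unity UDP listener
UNITY_PORT = 12030         # arbitrary port; must match the Unity side

_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

def encode_weights(names, weights):
    """Pack blendshape names and their solved weights into a JSON payload."""
    data = {"bs": {n: round(float(w), 4) for n, w in zip(names, weights)}}
    return json.dumps(data).encode("utf-8")

def send_weights(names, weights):
    """Call this from the solver's per-frame update with the solved weights."""
    _sock.sendto(encode_weights(names, weights), (UNITY_HOST, UNITY_PORT))
```

On the Unity side you would then run a small UDP receiver that parses the JSON each frame and applies the weights to the corresponding blendshapes on the avatar's SkinnedMeshRenderer.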
Is there any way we can access this information in a way that can be automated?
This functionality is the holy grail for metaverse/gaming: totally reactive NPCs, or leaving your own avatar in the game 24/7 because it has trained on your prior experience and transfer-learned from an LLM. Audio2Face becomes the most powerful tool in the world once this problem is solved and public.
Yes, the solve is built using TensorRT, but since we lack a CPU implementation, we can’t precisely measure how much TensorRT accelerates the computation. That said, you can see the timing in the UI.