MetaHuman lipsync not correct

So I went through several Audio2Face tutorials to get a MetaHuman talking / singing in UE5 and I am very disappointed in the results. The animation is adequate at best inside Audio2Face, but after exporting the USD blend shapes to UE5 and applying them to my MetaHuman the results are vastly different… the mouth never closes and the animation is very different from how it looks in Audio2Face. Inside UE5 it needs tons of cleanup work to even look remotely believable. I’m so dissatisfied and disappointed in these results! I was really hoping for so much more! And why doesn’t this work on actual singing voices? And I’m not talking about rap!
In order to get it to even remotely work on singing vocals I had to actually re-record the voice track and speak what is normally sung just to get it to track and even then the results are not great. Am I missing something?


We are aware that the export to MetaHuman requires some specific tuning. The short version is that the blendshape solve and the export mapping to the MetaHuman rig could be better tuned out of the box. We are planning to improve this in the short term so the blendshape solve works better for MetaHuman by default.

In the meantime, you can tune the blendshape solve and try a different export; we have seen better results from that.

For singing, it really depends. Since Audio2Face is sound based, it can cover some rapping and singing, but it is not optimal for every type of song or singing. Some will work better than others. It is more tuned for speech-based animation at the moment.

Thanks for the feedback.

When you say “you can tune the blendshape solve” what are you referring to? How do you tune it?

So, Audio2Face will work with singing voices in the future? This will open up a ton of additional use cases if it can do accurate lip sync and facial animation for singing characters.

I wish the transfer to Metahuman characters was more automatic and an exact copy of the animation I see inside Audio2Face. Even better would be to preview the animation on a MetaHuman face inside Audio2Face. I see on your website that you show other character faces imported into Audio2Face.

Is it better / more accurate to export to Maya and then import to Unreal Engine? Or export to iClone and then import to Unreal. Which option provides the best results?

Any tutorials on how to tune the blendshape solve or how to fix it once it’s inside Unreal Engine? I tried using the additive animation layer once inside Unreal Engine… but there is so much to fix that it’s almost not even worth using Audio2Face… but maybe I’m doing something wrong?

Also, what about the facial expressions and the eyes? When is that update going to be released? I really need to get accurate lipsync and facial animation for my Metahuman character and I was counting on this solution to work… so much so that I went out and bought a high-end gaming PC with an RTX3090 specifically for this.

Hello @idancerecords! You can take a look at our Audio2Face documentation for helpful information, tutorials, and videos.

Here is a link to our full documentation where you can learn more about all the features of Omniverse.

Also, check out the Learn tab on the Omniverse launcher!

Thank you, but I have already reviewed all the Audio2Face documentation and tutorials and none of them addresses the issues that I have mentioned above regarding tuning the blendshapes solve or exporting to MetaHuman. I really want to like this software but at this point I can’t really recommend it to anyone without these glaring issues being resolved which basically makes the software unusable and a huge waste of time. Please fix the software so that it does what you claim it can do and create some documentation and tutorials specifically for “tuning the blendshape solve” for MetaHumans or just make it work without having to “tune” anything. I don’t want it to be kind of correct, I want it to be exactly correct or at least 90% correct out of the box so that the animation is actually usable.


(sorry for cross-posting @siyuen )

+1 on this. I used the Audio2Face 2021.3.3 export to MetaHumans in Unreal 4 with great results and eagerly awaited the 2022.1 release. But when I tried to export data from the new “Full Face” model to MetaHumans in Unreal 5, the lip sync completely failed.

Compare this screen capture in Audio2Face 2022.1 to the export of the same motion data to MetaHuman 5 via Blueprint.

After so many pushed back release dates due to quality assurance and the GREAT new emotion features, this is kind of frustrating. Can I kindly ask the awesome NVIDIA engineers on this project to look into this and post a fix?

Rather than waiting for another full update, maybe just posting an updated Blueprint for export to MetaHumans/UE5 as a separate file would be faster and therefore much appreciated.

Thank you very much in advance for considering this.

I’ve been facing the same problem on UE4 4.27. The great lip-synch is destroyed when I export to MetaHuman Sequence. I tuned the blendshape solve (something between 3), but it didn’t help much.

We are aware of these issues and are looking into improving the blendshape export quality, in particular to MetaHuman. More generally, we also plan to provide more documentation and tips on how to use the blendshape solve, so users can more easily export the best performance out of Audio2Face through blendshapes in other apps.

The animation posted seems very wrong. We are not getting that same result here in our other tests with MetaHuman.

We will take a closer look and test this. If you have any scene file to share, or even just that wav file, that will help us narrow down the issue. Thanks.

Thanks for reaching out, @siyuen !

Here’s the audio file:

Will try to get the scene file asap, it’s on a remote machine…

Hi @feel.or.fake ,

Thanks for providing the audio file. I downloaded it and ran it with the default settings; the result on MetaHuman is this, for your debugging purposes (although tweaking some parameters would make it look better).
Can you also test this attached .usd file that I exported and see if it makes okay animation on your side?
stani_tony_neutral.usd (769.2 KB)

Hello @siyuen @yseol ,
I know this is not just Nvidia’s responsibility, at least I see that very good facial animations can be made in A2F, but when we transfer it to Unreal, this is no longer the case. So I would like to ask if there has been any change since the others reported this problem, and you are aware of this issue? Thanks in Advance.

Hi @daniel.magyar, welcome to Audio2Face.

We hear about this issue from time to time, but it hasn’t been clear whether it was a bug or the user was doing something different from what we expected.
There will be some loss of detail when we transfer an animation from A2F to MH (due to the different rig system and pose shapes), but it shouldn’t break the overall animation or do something completely wrong.
If you experience something wrong, please share it with us so we can identify what makes it go wrong. Thanks!

Hi @yseol ,

Thank you so much for getting back to me so quickly. I want to assure you that every step I took in this process was based on Nvidia’s official videos, following them step by step. However, I will still provide you with a detailed breakdown of my pipeline. I did everything based on the videos that you made and narrated.

Initially, I didn’t focus on generating a custom mesh and I didn’t spend any time creating blend shapes for it.
I jumped right into the third step, but soon realized that the facial animation in A2F didn’t match what was visible in Unreal Engine. At that point, I thought that the difference and inaccuracy might be due to the A2F mesh not matching the Unreal mesh (MetaHuman face mesh).
So, I decided to create everything from the first step on the MetaHuman mesh, ensuring that the movement was consistent from start to finish.
Unfortunately, this still didn’t provide a solution. As seen in your official videos, it’s evident that the Blendshape Solver’s face moves slightly differently than the original A2F mesh. However, this is still far superior to what is possible in Unreal, which is simply unacceptable.
It’s a shame that we can’t utilize A2F’s incredible quality properly yet.

So here is my pipeline:

  1. First of all, I generated blendshapes on my custom face mesh (MetaHuman face)
    using the blendshape generation tool located in the character transfer tab.
    Reference video: BlendShape Generation in Omniverse Audio2Face - YouTube

  2. Then, I connected this custom character with BlendShapes to Audio2Face.
    Reference video: BlendShapes Part 4: Custom Character Conversion and Connection in Omniverse Audio2Face - YouTube

  3. Everything went fine. I got beautiful results; my custom mesh speaks and moves perfectly.
    After this I exported the Blendshape Solver result as USD and imported it into Unreal Engine using “Import Facial Animation”, just as in your video.
    Reference video: Overview of the Omniverse Audio2Face to Metahuman Pipeline with Unreal Engine 4 Connector - YouTube

However, the end result in Unreal Engine is completely disconnected from the beautiful and flawless movement that came from A2F. This is my unlisted video with the results. Unlisted video here: A2F to UE - YouTube

I hope I was able to help by describing step by step what I did and how I did it. I have to say, I’ve already done this roughly ten times a day this week. So, I think that if I had made a mistake once or twice, it would have become apparent, especially since this is my profession, which I’ve been doing for over 23 years. I’m giving up now because I don’t think it’s up to the users. It seems that this is something that happens or changes somewhere between A2F and UE5.1 - Metahuman, and there’s nothing we can do about it.

But thank you very much for addressing this issue! I think this tool provides the best results that can be achieved on the market today. I believe that A2F has tremendous potential, but unfortunately, the significant loss of quality resulting from the Unreal integration cuts it off.

I’m looking forward to your reply. Have a nice day!

Hi @daniel.magyar

Thanks for the detailed explanation.
One question: in the latter part of the video, you have 2 animation clips that look better than the first one. How were they generated?

I guess some tweaking of the blendshape solve setup can make this better. For example, you can turn some blendshape poses on/off in this UI. I suggest you turn off some of the mouth shapes like “jawDropLipTowards” (this makes the MetaHuman’s lower lip go upward, I think). You can also try turning off some depressor-related shapes as a test.
Tweaking the regularization parameters and the symmetry setting would also help with the details.
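
To make the idea of "turning poses off" and "regularization" a bit more concrete, here is a minimal toy sketch of a blendshape solve in plain Python. It is not Audio2Face's actual solver or API; the function names (`blend`, `solve_weights`), the `active` mask, and the `l2` parameter are all assumptions made up for illustration. It fits weights in [0, 1] so that a weighted sum of pose deltas approximates a target deformation, skips any pose that is switched off, and optionally penalizes large weights.

```python
def blend(poses, weights):
    """Weighted sum of pose delta vectors (each pose is a flat list of floats)."""
    dim = len(poses[0])
    return [sum(w * p[k] for w, p in zip(weights, poses)) for k in range(dim)]


def solve_weights(poses, target, active=None, l2=0.0, iters=2000):
    """Toy solve (hypothetical, not the A2F implementation).

    Minimizes ||blend(poses, w) - target||^2 + l2 * ||w||^2 by projected
    gradient descent, keeping every weight in [0, 1]. Poses whose entry in
    `active` is False are held at weight 0, i.e. "turned off".
    """
    n = len(poses)
    if active is None:
        active = [True] * n
    # Conservative step size: 1 / (upper bound on the gradient's Lipschitz constant).
    lip = 2.0 * (sum(sum(x * x for x in p) for p, a in zip(poses, active) if a) + l2)
    step = 1.0 / lip if lip > 0 else 0.0
    w = [0.0] * n
    for _ in range(iters):
        current = blend(poses, w)
        residual = [c - t for c, t in zip(current, target)]
        for i in range(n):
            if not active[i]:
                continue  # disabled pose stays at zero
            grad = 2.0 * sum(p * r for p, r in zip(poses[i], residual)) + 2.0 * l2 * w[i]
            w[i] = min(1.0, max(0.0, w[i] - step * grad))
    return w


# Toy data: three 2-D "poses"; the third overlaps with the first two.
poses = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target = [0.5, 0.5]

# Third pose turned off: the target must be explained by the first two.
w_masked = solve_weights(poses, target, active=[True, True, False])

# Same mask, plus regularization that pulls the weights toward zero.
w_reg = solve_weights(poses, target, active=[True, True, False], l2=1.0)
```

In this toy setup, disabling the overlapping pose changes which shapes carry the motion, and raising `l2` trades fitting accuracy for smaller weights. That is the kind of trade-off the on/off toggles and regularization settings are meant to expose, though the real solver may work differently.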

Even if you tweak them carefully, the auto-remapping sometimes cannot achieve the artists’ high quality. In that case, you can use Maya to tweak the animation like in this video and export to UE.

If you still get bad animation as shown in the video after this suggestion, please let us know + please attach the .usd file you exported.

I’m sorry but the response to these issues from Nvidia is just unacceptable! Nobody wants to do all that extra work to “maybe” get an acceptable result when your software is supposed to do it all and it’s supposed to work AND you have known about this issue for like a year now!

Just fix it already and stop making excuses and stop assuming we are all doing something wrong! If you advertise a feature then it should work exactly as advertised! Period!

I suspect that you really don’t care about these animations working on MetaHumans because you have your own avatar system in the works… which is fine but then don’t claim that it works with MetaHumans and waste everyone’s time!!
Either fix your software so it does what you are claiming it can do Or remove the feature because it’s broken and you can’t fix it!

I’m upset about this because this software definitely has potential but it’s currently a big waste of time with unacceptable results.
If you still disagree then prove it! Make a step by step tutorial that shows it working … a tutorial that anyone can follow and produce the same results!

Hi @yseol

Thank you for the quick and detailed response!

I created those animations using the same setup, but the only difference is that I attempted to tweak the parameters within A2F. However, it’s apparent to everyone that they don’t come close to the same level of quality achievable within A2F. If A2F is at 100%, then the Unreal version is only at about 20%. That’s the crux of the issue. Modifying blendshapes like “jawDropLipTowards” may result in slightly better quality, but it’s not just a small difference; it’s an entirely different type of motion from the original A2F animation.

To be honest, I’m not an expert in this area, so I wouldn’t dare to criticize your work. I think what you’ve done is brilliant, and A2F will likely revolutionize the industry. However, I do notice that many subtle movements that make A2F so realistic are lost when exporting to UE. As you mentioned, the differences between the two face rigs (A2F vs. MetaHuman) are likely the root cause of the issue. Given this, I don’t see much point in tweaking the parameters since it’s unclear whether these small changes will come close to matching the original A2F motion.

But I do have a question: is there a way to transfer the face mesh to UE and replace the original MetaHuman face, like a geometry cache? What I mean is, just like how we animate the face while MetaHuman provides the hair, eyebrows, and eyelashes, can we replace the original face with a different one while leaving everything else as the MH base? I’m not sure. I’m out of ideas. This problem seems to be beyond the scope of simple users like myself who tweak parameters here and there. It feels like the only real solution is to synchronize the two facial rig systems, which is obviously a complex task.

What are your thoughts? Is there any chance we could have a session or meeting where we can share our screens and do the entire process on my computer based on your instructions? This way, we can all see what we’re doing differently or what might be causing the discrepancy between the original A2F face and the MH face.
Please DM me with the details if it’s possible.

One more thing, don’t take everything too seriously. After all, A2F is still in its early stages and it’s accessible to everyone for free. From my perspective, this is just an initial problem that the programmers will likely be able to solve easily over time. I’m feeling optimistic and rooting for you to resolve this issue as soon as possible!

Many Thanks!