Audio2Face to FACS

Danlowlows · April 19, 2021, 5:19pm

The facial motion that Audio2Face puts out is really impressive, but it would be hard to use in a video game pipeline because it’s exporting a mesh cache instead of FACS slider data.

I’m curious if there’s any way to process/export the motion to FACS sliders or if anything like that is planned for the future?

I guess if you have a FACS rig for the example face, you could maybe train a neural network to do random FACS slider values on the rig, then check the delta between the vertex positions of the rig and the vertex positions of the mesh cache. That might be one way to do it.
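As a rough illustration of the "compare vertex deltas" idea, here is a minimal sketch that swaps the neural network for a direct fit. It assumes the FACS rig is a purely linear blendshape rig (neutral mesh plus weighted per-slider offsets) and searches for slider values that minimise the squared vertex error against one frame of the mesh cache. The toy rig, the slider meanings, and the function names are all hypothetical. This is not how Audio2Face works internally.

```python
# Toy data: 3 vertices flattened to [x0, y0, z0, x1, ...]. All values are made up.
NEUTRAL = [0.0] * 9
DELTAS = [
    [0.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],  # hypothetical "jaw open" slider
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.0],   # hypothetical "brow raise" slider
]

def pose(neutral, deltas, weights):
    """Linear blendshape model: neutral + sum(weight_k * delta_k)."""
    out = list(neutral)
    for w, delta in zip(weights, deltas):
        for i, d in enumerate(delta):
            out[i] += w * d
    return out

def solve_weights(neutral, deltas, target, steps=2000, lr=0.1):
    """Fit slider weights in [0, 1] by projected gradient descent."""
    weights = [0.0] * len(deltas)
    for _ in range(steps):
        current = pose(neutral, deltas, weights)
        residual = [c - t for c, t in zip(current, target)]
        for k, delta in enumerate(deltas):
            # Gradient of the squared vertex error with respect to slider k.
            grad = 2.0 * sum(r * d for r, d in zip(residual, delta))
            # Step downhill, then clamp back into the valid slider range.
            weights[k] = min(1.0, max(0.0, weights[k] - lr * grad))
    return weights

# One "mesh cache" frame, generated here from known slider values for checking.
cache_frame = pose(NEUTRAL, DELTAS, [0.48, 0.20])
solved = solve_weights(NEUTRAL, DELTAS, cache_frame)
print([round(w, 3) for w in solved])
```

Running the solve once per cached frame would give one float channel per slider over time. A real rig with correctives or non-linear deformers would need something more capable than this linear fit.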

RonanDB · April 20, 2021, 12:23am

Hi, thanks for the question and your interest in A2F. We do not currently support blendshapes, but blendshape support is planned and will be available in the future.

siyuen · April 20, 2021, 10:35am

@Danlowlows, just curious what usage you are thinking of, and whether you have a specific pipeline in mind?

I am aware of game pipelines that use joint or blendshape approaches, but I'd like to know more about the specific use case you are thinking of. A2F can actually work with blendshapes fine; as Ronan mentioned, it is something we are going to add. We have also done some tests that constrain a game rig's joints to the cached surface to create animation that way (specifically for game use), and it worked very well.

We also have other ideas for how to convert the data you see to whatever FACS rig anyone has. Blendshape support alone can get tricky, as everyone may have a slightly different rig, so we want to make this easy for anyone. Stay tuned for more updates in this area, and let us know if you have other feedback.

Cheers.

Danlowlows · April 20, 2021, 4:55pm

@siyuen Hi there! Thanks for the reply.

What I’d like to be able to do is export motion data that can then be used on a game rig. The best format for that, at least for most modern facial rigs for games, is float channels that represent different FACS shapes. Mesh caches are not particularly useful as a data format for games (at least at runtime), for a few reasons…

1. Vertex motion saves both motion and shape together, which makes it very mesh-specific: you can't port it between characters. One of the nice things about FACS is that it uses generic descriptors like “the jaw is 48% open”, “the middle of the left eyebrow is 20% raised”, etc. Those kinds of descriptors separate the description of the motion from the shape of the face, and leave the specifics of what to do with them to each individual facial rig. That's a lot more useful for game teams.
2. Vertex motion is difficult to edit. Say I really like the lip sync from a capture, but want to make the character smile a bit more, or raise their eyebrows more; that's very hard to do with a mesh cache. With blendshapes, bones, FACS rigs, or really anything driven by float channels, you have controls in place that an animator can use, so it's a lot easier to make those adjustments. This includes runtime blending of animations, e.g. having the lips run on a separate animation layer from the eyes, so you can dynamically control eye look-at direction, or the emotion in the brow, separately from what the person is saying.
3. Mesh caches are comparatively heavy, memory-wise, so they aren't really suitable as a runtime animation file format.
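To make the editing and layering point concrete, here is a small sketch of what float channels allow. A frame is just a dictionary of channel values, so an animator's tweak is an offset on a few channels, and a runtime layer is a per-channel override. The channel names and values are hypothetical and not taken from any particular rig.

```python
def apply_offsets(frame, offsets):
    """Add an animator's tweak to selected channels, clamped to the 0..1 slider range."""
    edited = dict(frame)
    for channel, offset in offsets.items():
        edited[channel] = min(1.0, max(0.0, edited.get(channel, 0.0) + offset))
    return edited

def layer(base, override, channels):
    """Take the listed channels from `override` and everything else from `base`."""
    result = dict(base)
    for channel in channels:
        if channel in override:
            result[channel] = override[channel]
    return result

# One frame of lip sync and one frame of a separate "emotion" layer (made-up values).
lip_sync = {"jaw_open": 0.48, "smile_left": 0.10, "smile_right": 0.10,
            "brow_mid_left_raise": 0.0}
emotion = {"jaw_open": 0.0, "smile_left": 0.0, "smile_right": 0.0,
           "brow_mid_left_raise": 0.20}

# Keep the lip sync, but make the character smile a bit more.
smilier = apply_offsets(lip_sync, {"smile_left": 0.25, "smile_right": 0.25})

# Drive the brow from the emotion layer, independently of the speech.
final = layer(smilier, emotion, ["brow_mid_left_raise"])
print(final)
```

Neither operation has an obvious equivalent on a raw vertex cache, where the smile, the jaw, and the brow are all baked into the same vertex positions.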

I appreciate your point about people having different face rigs, but once the data is in a float channel format, it’s fairly straightforward for a technical animator at a studio to write a script to convert the data if you have access to both the source and target rig. You set each channel on the source rig, one at a time, to 100%, then animate the target rig to match that shape, and then save out a mapping of the values for that shape (e.g. 100% on this channel = this combination of channel values on the other rig).

Once you have that mapping you can process all future captures very quickly. This approach isn’t possible with a mesh cache though.
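The mapping step described above can be sketched as follows. This assumes the simplest possible case, where a source channel at less than 100% scales its target values linearly and contributions from several source channels just add up. All channel names and mapping values are hypothetical.

```python
# For each source channel at 100%, the target-rig channel values that match that shape.
# A technical animator would author this table once by matching shapes by eye.
MAPPING = {
    "jaw_open": {"JawDrop": 0.9, "LipsPart": 0.3},
    "brow_mid_left_raise": {"BrowUp_L": 1.0},
}

def retarget_frame(source_frame, mapping):
    """Convert one frame of source channel values into target channel values."""
    target = {}
    for src_channel, value in source_frame.items():
        # Source channels without a mapping entry are ignored.
        for tgt_channel, value_at_full in mapping.get(src_channel, {}).items():
            target[tgt_channel] = target.get(tgt_channel, 0.0) + value * value_at_full
    # Clamp in case several source channels push the same target channel past 100%.
    return {name: min(1.0, max(0.0, v)) for name, v in target.items()}

def retarget_clip(frames, mapping):
    """Apply the same mapping to every frame of a capture."""
    return [retarget_frame(frame, mapping) for frame in frames]

clip = [
    {"jaw_open": 0.0, "brow_mid_left_raise": 0.2},
    {"jaw_open": 0.5, "brow_mid_left_raise": 0.2},
]
retargeted = retarget_clip(clip, MAPPING)
print(retargeted[1])
```

Once the table exists, processing a new capture is just a loop over frames, which is why this is quick for float channel data.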

If A2F supports blendshapes, I think allowing users to export the blendshape channel data would add a huge amount of value. It would also be very helpful to include the basic blendshape rig with no motion on it, as something like an .fbx file, so that users have a reference for what each individual channel is doing and can match the shapes and create a mapping for the data. It wouldn’t need any complex controls or anything like that; just the float channels for each of the shapes.

Hope that all makes sense? Thanks again for taking the time to follow up.

damianlewis · April 21, 2021, 11:04pm

Totally agree and there are other live puppetry use cases where text-to-speech to A2F to blendshape to control rig is highly desirable and could significantly reduce the time and cost overhead of experimenting with different approaches.

bold.stelvis · April 22, 2021, 11:49pm

It's not just games pipelines - ‘blackbox’ mesh-only deformation to drive a performance is problematic for ANY cg pipeline where you can’t easily edit the results - especially for something that's as liable to need ‘direction’ as a facial performance :)

This is even an issue with dynamic sims in general and their lack of ‘art direct-ability’, but at least on the ‘simulating physics’ side, the price paid in flexibility is worth it for the otherwise impossible-to-create complexity you gain. With face pipelines, you either need them to be VERY VERY good out of the box or you need the ability to tweak (and though that is not impossible to set up on top of a cache, it's much easier to build onto a rig).

casadeasterion · July 24, 2021, 9:37am

Totally agree. A geometry cache is pretty difficult to handle when you combine motions from several sources. Another good option would be to use Apple ARKit blendshapes, which are common to a lot of pipelines.

siyuen · July 30, 2021, 7:00am

Thank you for all the feedback.

Something that might be of interest to people here: we are soon going to release a Blendshape Solve option for Audio2Face. It will come out in 2021.3.

This will give users a live conversion from the final A2F result to their own custom blendshape rig, side by side (only blendshapes are supported at the moment), so you can see and tune the result live.

Then you can export the blendshape keyframe information from the solve to your own software package.

Stay tuned for this update.

esusantolim · August 24, 2021, 6:56pm

Hello everyone!
Audio2Face 2021.3 is out today. You can check the announcement in the forum post here: Audio2Face 2021.3 is now available
A couple of things to highlight in this version: the blendshape solve feature, and deformation on meshes with geomSubset materials is now supported.
