Running the Audio2Face microservice container on serverless compute

Hey.
I am working on a digital human startup, and I am using the Audio2Face (A2F) microservice through NVIDIA AI Enterprise on Azure Marketplace.

This costs me about 1.5 USD/hr to run (0.5 USD for the GPU instance + 1 USD for the AI Enterprise subscription), which comes to over 1000 USD a month if it runs around the clock. That is a lot.
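For reference, here is the arithmetic behind that monthly figure as a quick sketch. The hourly rates are the ones quoted above; the 30-day month and 24/7 uptime are my assumptions.

```python
# Hourly rates quoted in the post (USD).
GPU_USD_PER_HOUR = 0.5
LICENSE_USD_PER_HOUR = 1.0

# Assumption: the VM runs 24/7 over a 30-day month.
HOURS_PER_MONTH = 24 * 30


def monthly_cost(hourly_rate, hours=HOURS_PER_MONTH):
    """Cost in USD of keeping an instance up for `hours` at `hourly_rate`."""
    return hourly_rate * hours


hourly = GPU_USD_PER_HOUR + LICENSE_USD_PER_HOUR
print(f"{hourly} USD/hr -> {monthly_cost(hourly)} USD/month")
```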

A2F only has to run after the user sends a prompt: a response audio file is created and sent to A2F for animation synthesis. It only has to run for a few seconds per request.
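To put a rough number on how idle the GPU is, here is a small sketch. The request volume and the per-request busy time below are hypothetical placeholders, not measurements from my app.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def gpu_utilization(requests_per_day, busy_seconds_per_request):
    """Fraction of the day the GPU is actually doing A2F work."""
    busy_seconds = requests_per_day * busy_seconds_per_request
    return busy_seconds / SECONDS_PER_DAY


# Hypothetical load: 1000 requests a day, about 5 seconds of GPU time each.
util = gpu_utilization(requests_per_day=1000, busy_seconds_per_request=5)
print(f"GPU busy {util:.1%} of the time")
```

With numbers anywhere near these, the instance sits idle for the vast majority of the hours I am paying for.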

Therefore, I think it makes more sense to deploy the container on a serverless service with GPU compute, such as AWS Fargate/ECS, Azure ACI, etc.

What are the shortcomings of this approach? And if it is a good one, how do I do it? That's the main question, since my AI Enterprise subscription is tied to an Azure Marketplace VM. I just need some basic guidance.

Thank you in advance.

Hi @varun20 – I’ll try to get more details specific to Audio2Face but I think you will probably have some challenges getting access to GPUs on serverless cloud platforms. Fargate doesn’t support GPUs at all (AWS Fargate GPU Support: When is GPU support coming to fargate? · Issue #88 · aws/containers-roadmap · GitHub) and the ACI docs on GPU usage point to a number of limitations (Deploy GPU-enabled container instance - Azure Container Instances | Microsoft Learn) – in particular, only support for V100 GPUs and an 8-10 minute deployment time.

Hey @neal.vaidya, thank you so much for the quick response. Yeah, I think it will be tricky to go serverless with this.

The other option mentioned in the A2F docs is to use NVIDIA Cloud Functions: NVIDIA NIM | audio2face. This could be a viable solution to my problem.

But I am not sure whether this API supports production-level deployments. All I could learn is that business accounts can get up to 5000 API credits, but I can’t find anything beyond that (no documentation or pricing).

If I could get more specific info on the A2F cloud function, mostly regarding pricing and whether I can use it for production-level applications, that would be great. Also, I forgot to mention earlier that I am developing in Unreal Engine. This doc is my main reference: NVIDIA ACE 2.1 Plugin — ACE documentation latest documentation

Basically, any serverless method would be great for us, whether we deploy the A2F microservice container on a serverless service on a cloud platform or get production-level access to the Audio2Face cloud function.

Paying per request is what makes the most sense for us. That’s what we need.
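To make the comparison concrete, here is a rough break-even sketch against my current always-on setup. The 1.5 USD/hr rate is the figure I gave earlier in the thread; the monthly request count and the per-request prices are purely hypothetical, since I have not found any published pricing.

```python
# Always-on baseline: 1.5 USD/hr, assuming 24/7 uptime over a 30-day month.
ALWAYS_ON_USD_PER_MONTH = 1.5 * 24 * 30


def break_even_price(requests_per_month):
    """Highest per-request price (USD) at which pay-per-request still beats always-on."""
    return ALWAYS_ON_USD_PER_MONTH / requests_per_month


def cheaper_option(price_per_request, requests_per_month):
    """Name the cheaper billing model for a given price and volume."""
    per_request_total = price_per_request * requests_per_month
    if per_request_total < ALWAYS_ON_USD_PER_MONTH:
        return "pay-per-request"
    return "always-on"


# Hypothetical volume: 30,000 requests a month.
print(f"break-even: {break_even_price(30_000):.3f} USD/request")
print(cheaper_option(price_per_request=0.01, requests_per_month=30_000))
```

So at that volume, anything under a few cents per request would already be cheaper than what I pay now.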