Tutorial: ASR (RIVA) + TTS (RIVA) + LLM (NIMs) + Audio2Face + Unreal Engine (Quickly Build Your Avatar)

Recently, I built a virtual assistant using NVIDIA technologies. Along the way I ran into some challenges, such as difficulty locating the relevant RIVA documentation and issues with the Unreal Engine integration, so I decided to share this project to help others build their own virtual assistants quickly. Each step includes detailed instructions I wrote, along with links to the relevant official documentation.

You can find the project on GitHub: LLMAvatarTalk-An-Interactive-AI-Assistant

Demo video:
YouTube Demo

Architecture:

(Architecture diagram: see the GitHub repository.)

Features:

  • Speech Recognition: Converts user speech into text in real time using NVIDIA RIVA ASR.
  • Language Processing: Leverages an advanced LLM (such as llama3-70b-instruct) via the NVIDIA NIM APIs for deep semantic understanding and response generation.
  • Text-to-Speech: Transforms generated text responses into natural-sounding speech using NVIDIA RIVA TTS.
  • Facial Animation: Generates realistic facial expressions and animations from the synthesized audio using Audio2Face.
  • Unreal Engine Integration: Enhances virtual character expressiveness by linking Audio2Face to Unreal Engine's MetaHuman in real time.
  • LangChain Integration: Simplifies working with NVIDIA RIVA and the NVIDIA NIM APIs, providing a seamless and efficient workflow for AI development.

Minimal code sketches for the ASR, LLM, TTS, and Audio2Face stages follow this list.

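As a concrete starting point, here is a minimal sketch of the ASR stage using the nvidia-riva-client Python package (plus PyAudio for microphone capture). The server address, sample rate, and chunk size are assumptions; adjust them to match your RIVA deployment.

```python
# Minimal sketch: streaming speech recognition with nvidia-riva-client.
# Assumes a RIVA server at localhost:50051 and a working microphone.
import riva.client
import riva.client.audio_io  # requires the pyaudio package

auth = riva.client.Auth(uri="localhost:50051")
asr_service = riva.client.ASRService(auth)

streaming_config = riva.client.StreamingRecognitionConfig(
    config=riva.client.RecognitionConfig(
        encoding=riva.client.AudioEncoding.LINEAR_PCM,
        language_code="en-US",
        sample_rate_hertz=16000,
        max_alternatives=1,
        enable_automatic_punctuation=True,
    ),
    interim_results=True,
)

# Stream microphone chunks to RIVA and print each final transcript.
with riva.client.audio_io.MicrophoneStream(rate=16000, chunk=1600) as audio_chunks:
    responses = asr_service.streaming_response_generator(
        audio_chunks=audio_chunks, streaming_config=streaming_config
    )
    for response in responses:
        for result in response.results:
            if result.is_final:
                print(result.alternatives[0].transcript)
```
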
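For the language step, the NIM endpoints plug directly into LangChain through the langchain-nvidia-ai-endpoints package. A minimal sketch, assuming an NVIDIA_API_KEY environment variable is set; the system prompt here is illustrative:

```python
# Minimal sketch: generating a reply with llama3-70b-instruct via NVIDIA NIM.
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_nvidia_ai_endpoints import ChatNVIDIA

llm = ChatNVIDIA(model="meta/llama3-70b-instruct", temperature=0.7)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful voice assistant. Keep answers short."),
    ("human", "{transcript}"),
])
chain = prompt | llm | StrOutputParser()

# The transcript would come from the ASR stage above.
reply = chain.invoke({"transcript": "What can you do?"})
print(reply)
```
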
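The reply can then be synthesized with RIVA TTS in a single batch call. A minimal sketch; the voice name is an assumption, so check which voices your RIVA server actually serves:

```python
# Minimal sketch: batch synthesis with RIVA TTS, saved as a WAV file.
import wave

import riva.client

auth = riva.client.Auth(uri="localhost:50051")  # assumed RIVA server address
tts_service = riva.client.SpeechSynthesisService(auth)

response = tts_service.synthesize(
    text="Hello! How can I help you today?",
    voice_name="English-US.Female-1",  # assumed voice; verify on your server
    language_code="en-US",
    encoding=riva.client.AudioEncoding.LINEAR_PCM,
    sample_rate_hz=44100,
)

# response.audio holds raw 16-bit PCM; wrap it in a WAV container.
with wave.open("reply.wav", "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)  # 16-bit samples
    out.setframerate(44100)
    out.writeframes(response.audio)
```
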
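Finally, Audio2Face can ingest the synthesized audio through its gRPC streaming player, which in turn drives the MetaHuman in Unreal Engine. The sketch below assumes the helper audio2face_streaming_utils.py from NVIDIA's Audio2Face sample scripts is on your path, and that the gRPC address and player prim path match your scene; all three are assumptions here.

```python
# Minimal sketch: pushing RIVA TTS audio into Audio2Face's streaming player.
import numpy as np

# Helper shipped with the Audio2Face sample scripts (assumed to be on the path).
from audio2face_streaming_utils import push_audio_track

A2F_URL = "localhost:50051"                       # assumed A2F gRPC address
A2F_PLAYER = "/World/audio2face/PlayerStreaming"  # assumed player prim path

def send_to_audio2face(pcm_bytes: bytes, sample_rate: int) -> None:
    """Convert 16-bit PCM from RIVA TTS to float32 and push it to Audio2Face."""
    samples = np.frombuffer(pcm_bytes, dtype=np.int16).astype(np.float32)
    samples /= 32768.0  # scale to the [-1.0, 1.0] range Audio2Face expects
    push_audio_track(A2F_URL, samples, sample_rate, A2F_PLAYER)
```

One thing to watch for: Audio2Face's streaming server and RIVA both commonly default to port 50051, so if they run on the same machine you will need to change one of them.
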
Prerequisites

At a minimum, based on the components above, you will need:

  • A running NVIDIA RIVA server (ASR and TTS)
  • An NVIDIA API key for the NIM LLM endpoints
  • NVIDIA Omniverse Audio2Face
  • Unreal Engine with a MetaHuman character
  • Python with the nvidia-riva-client and langchain-nvidia-ai-endpoints packages

I hope these resources help you get started quickly and create your own virtual assistant. If you encounter any issues along the way, feel free to submit an issue on the GitHub page or ask a question in the forum :)

Thank you! For more on this topic, see the Virtual Beings Facebook group.