Wednesday, August 10th, 2022 Posted by Jim Thacker

Omniverse Audio2Face gets new Audio2Emotion system


Nvidia has released Omniverse Audio2Face 2022.1, the latest version of its experimental free AI-based software for generating facial animation from audio sources.

The release adds Audio2Emotion: a new system that detects an actor’s emotional state from their voice, and adjusts the performance of the 3D character accordingly, enabling it to express emotions like joy or pain.

The underlying technology forms part of Omniverse Avatar Cloud Engine, an upcoming set of AI models and services for generating interactive online avatars, announced by Nvidia at Siggraph 2022.

Generate automatic lip-sync and facial animation for Character Creator characters from audio files
First released last year, Audio2Face is an AI-trained tool for generating facial animation for a 3D character from audio sources: either offline recordings of speech, or a live audio feed.

Along with sister app Omniverse Machinima, the software is one of a set of new games tools Nvidia is developing around Omniverse, its USD-based real-time collaboration platform.

As well as using the animation it generates inside Omniverse itself, users can convert its output to blendshape-driven animation for export to other DCC applications in USD format.
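
For readers unfamiliar with the term, blendshape-driven animation poses a mesh by adding weighted per-vertex offsets to a neutral base shape, with the weights changing from frame to frame. The sketch below is a generic illustration of that idea, not Audio2Face's export code or the USD API: the `apply_blendshapes` function, the two-vertex mesh and the shape names `jawOpen` and `smile` are all hypothetical.

```python
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


def apply_blendshapes(
    base: List[Vec3],
    deltas: Dict[str, List[Vec3]],
    weights: Dict[str, float],
) -> List[Vec3]:
    """Return the base vertices offset by the weighted sum of per-shape deltas."""
    result = [list(v) for v in base]
    for name, weight in weights.items():
        # Each shape stores one offset per vertex, relative to the base mesh.
        for i, (dx, dy, dz) in enumerate(deltas[name]):
            result[i][0] += weight * dx
            result[i][1] += weight * dy
            result[i][2] += weight * dz
    return [(x, y, z) for x, y, z in result]


# Hypothetical two-vertex "mouth" mesh and shape names, for illustration only.
base = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
deltas = {
    "jawOpen": [(0.0, -1.0, 0.0), (0.0, -1.0, 0.0)],
    "smile": [(-0.5, 0.5, 0.0), (0.5, 0.5, 0.0)],
}

# One frame of animation: a weight per shape, which is the kind of data
# a blendshape-based export would carry for every frame.
frame_weights = {"jawOpen": 0.5, "smile": 1.0}
posed = apply_blendshapes(base, deltas, frame_weights)
```

Because only the per-frame weights change, the same animation can in principle be replayed on any character that provides a matching set of shapes.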

Now automatically modifies facial expressions to match an actor’s emotional state
To that, Omniverse Audio2Face 2022.1 adds support for facial emotions, via a full-face neural network “trained with a range of emotions like joy, amazement, anger, and sadness”.

The new Audio2Emotion system infers the emotional state of an actor from their voice and adjusts the facial performance of the 3D character it is driving accordingly.

The result is an automatically generated animation that includes not only lip and facial movements that match the audio track, but changes of expression that match the actor’s shifts of emotion.

Users can adjust the performance manually, either with a set of simple slider controls or by conventional keyframe editing. As well as the new Emotion Panel, a new “simple key framing UI” has been added to the software’s Post-Processing panel.



Part of the underlying technology of Nvidia’s upcoming Omniverse Avatar Cloud Engine
The underlying technology of Audio2Face and Audio2Emotion also forms part of Omniverse Avatar Cloud Engine (ACE), Nvidia’s new “cloud-native AI for interactive avatar development”.

The suite of AI models and services is aimed at software developers, and is intended to streamline the process of creating chatbots, virtual assistants and NPC characters in games and real-time applications.

It is designed to be platform-agnostic – Nvidia’s demo shows a user interacting with a MetaHuman character in Unreal Engine – and will run on both embedded systems and “all major cloud services” on its release next year.

Pricing and system requirements
Omniverse Audio2Face is available in beta for Windows 10. It needs an Nvidia RTX GPU: the firm recommends a GeForce RTX 3070 or RTX A4000 or higher. All of the Omniverse tools are free to individual artists.


Read a full list of new features in Omniverse Audio2Face 2022.1 in the online release notes

Download Omniverse Audio2Face from Nvidia’s product website