Nvidia's AI Model Can Synthesize Sounds That Never Existed

Photo Credit: Yassine Ait Tahit

Nvidia unveils its Fugatto generative AI model, capable of synthesizing sounds that have never existed.

Nvidia recently announced its new AI audio generator, “Fugatto,” which can synthesize music, speech, or sounds based on a text prompt. What sets Fugatto apart from other generative AI audio models is its inference-level techniques, which enable it to transform any mix of audio, including the creation of sounds that have never been heard before.

A “Swiss Army knife for sound,” Fugatto is able to put together songs based on some pretty novel prompts, like creating a trumpet that meows or a saxophone that barks. Whatever a user can describe, the model can create, according to Nvidia. Other examples provided by the company include the ability to produce unique sound effects from a description: “Deep, rumbling bass pulses paired with intermittent, high-pitched digital chirps, like the sound of a massive sentient machine waking up.”

The model is able to edit music, such as isolating the vocals in a song, changing instruments, or switching up the melody. Fugatto can even transform the sound of someone's voice, such as changing their accent or giving them a calm or angry tone.

“We wanted to create a model that understands and generates sound like humans do,” said Rafael Valle, a manager of applied audio research at Nvidia and one of the researchers behind Fugatto. Valle is also an orchestral conductor and composer. “Fugatto is our first step towards a future where unsupervised multitask learning in audio synthesis and transformation emerges from data and model scale.”

One of the hardest parts of developing such a robust model, according to the company, was generating a blended dataset containing millions of audio samples used for training. “The team employed a multifaceted strategy to generate data and instructions that considerably expanded the range of tasks the model could perform, while achieving more accurate performance and enabling new tasks without requiring additional data,” says Nvidia.

The model is currently not publicly available, and the company did not reveal what the timeline for that might look like, or if it will ever become widely available. A website full of samples showcases its uses, providing a glimpse into the future of what's possible with ethical generative AI.
