‘Deep Voice’ Software Can Clone Anyone’s Voice With Just 3.7 Seconds of Audio

With just 3.7 seconds of audio, a new AI algorithm developed by Chinese tech giant Baidu can produce a pretty believable clone of a person’s voice. Like the rapid development of machine learning software that democratized the creation of fake videos, this research shows why it’s getting harder to believe any piece of media on the internet.

Researchers at the tech giant unveiled their latest advancement in Deep Voice, a system developed for cloning voices. A year ago, the technology needed around 30 minutes of audio to create a new, fake audio clip. Now, it can create even better results with just a few seconds of training material.

Of course, the more training samples it gets, the better the output: One-source results still sound a bit garbled, but they don’t sound much worse than a low-quality audio file might.

