That robot sounds just like you

First, OpenAI tackled text with ChatGPT, then images with DALL-E. Next, it announced Sora, its text-to-video platform. But perhaps the most pernicious technology is what might come next: text-to-voice. Not just audio — but specific voices.

A group of OpenAI clients is reportedly testing a new tool called Voice Engine, which can mimic a person’s voice based on a 15-second recording, according to the New York Times. And from there it can translate the voice into any language.

The report outlined a series of potential abuses: spreading disinformation, allowing criminals to impersonate people online or over phone calls, or even breaking voice-based authenticators used by banks.

In a blog post on its own site, OpenAI seems all too aware of the potential for misuse. Its usage policies mandate that anyone using Voice Engine obtain consent before impersonating someone else and disclose that the voices are AI-generated, and OpenAI says it’s watermarking all audio so third parties can detect it and trace it back to the original maker.

But the company is also using this opportunity to warn everyone else that this technology is coming, including urging financial institutions to phase out voice-based authentication.

AI voices have already wreaked havoc in American politics. In January, thousands of New Hampshire residents received a robocall from a voice pretending to be President Joe Biden, urging them not to vote in the Democratic primary election. It was generated using simple AI tools and paid for by an ally of Biden’s primary challenger Dean Phillips, who has since dropped out of the race.

In response, the Federal Communications Commission clarified that AI-generated robocalls are illegal, and New Hampshire’s legislature passed a law on March 28 that requires disclosures for any political ads using AI.

So, what makes this so much more dangerous than any other AI-generated media? The imitations are convincing. The Voice Engine demonstrations so far shared with the public sound indistinguishable from the human-uttered originals — even in foreign languages. But even the Biden robocall, which its maker admitted was made for only $150 with tech from the company ElevenLabs, was a good enough imitation.

But the real danger lies in the absence of other indicators that the audio is fake. With every other kind of AI-generated media, there are clues for the discerning viewer or reader. AI text can feel clumsily written, hyper-organized, and chronically unsure of itself, often refusing to give real recommendations. AI images often have a cartoonish or sci-fi sheen, depending on their maker, and are notorious for getting human features wrong: extra teeth, extra fingers, and ears without lobes. AI video, still relatively primitive, is infinitely glitchy.

It’s conceivable that each of these applications of generative AI will improve to the point where it’s indistinguishable from the real thing, but for now, AI voices are the only iteration that feels like it could become utterly undetectable without proper safeguards. And even if OpenAI, often the first to market, is responsible, that doesn’t mean all actors will be.

As such, the announcement of Voice Engine, which doesn’t have a set release date, feels less like a product launch and more like a warning shot.
