That robot sounds just like you

First, OpenAI tackled text with ChatGPT, then images with DALL-E. Next, it announced Sora, its text-to-video platform. But perhaps the most pernicious technology is what might come next: text-to-voice. Not just audio — but specific voices.

According to the New York Times, a group of OpenAI clients is reportedly testing a new tool called Voice Engine, which can mimic a person’s voice from just a 15-second recording. From there, it can translate that voice into any language.

The report outlined a series of potential abuses: spreading disinformation, allowing criminals to impersonate people online or over phone calls, or even breaking voice-based authenticators used by banks.

In a blog post on its own site, OpenAI seems all too aware of the potential for misuse. Its usage policies mandate that anyone using Voice Engine obtain consent before impersonating someone else and disclose that the voices are AI-generated. OpenAI also says it is watermarking all audio so third parties can detect it and trace it back to its original maker.

But the company is also using this opportunity to warn everyone else that this technology is coming, including urging financial institutions to phase out voice-based authentication.

AI voices have already wreaked havoc in American politics. In January, thousands of New Hampshire residents received a robocall from a voice pretending to be President Joe Biden, urging them not to vote in the Democratic primary election. It was generated using simple AI tools and paid for by an ally of Biden's primary challenger Dean Phillips, who has since dropped out of the race.

In response, the Federal Communications Commission clarified that AI-generated robocalls are illegal, and New Hampshire’s legislature passed a law on March 28 that requires disclosures for any political ads using AI.

So, what makes this so much more dangerous than any other AI-generated media? The imitations are convincing. The Voice Engine demonstrations so far shared with the public sound indistinguishable from the human-uttered originals — even in foreign languages. And even the Biden robocall, which its maker admitted cost only $150 to produce with tech from the company ElevenLabs, was a convincing enough imitation.

But the real danger lies in the absence of other indicators that the audio is fake. With every other AI-generated media, there are clues for the discerning viewer or reader. AI text can feel clumsily written, hyper-organized, and chronically unsure of itself, often refusing to give real recommendations. AI images often have a cartoonish or sci-fi sheen, depending on their maker, and are notorious for getting human features wrong: extra teeth, extra fingers, and ears without lobes. AI video, still relatively primitive, is infinitely glitchy.

It’s conceivable that each of these applications for generative AI improves to a point where they’re indistinguishable from the real thing, but for now, AI voices are the only iteration that feels like it could become utterly undetectable without proper safeguards. And even if OpenAI, often the first to market, is responsible, that doesn’t mean all actors will be.

The announcement of Voice Engine, which doesn’t have a set release date, feels less like a product launch and more like a warning shot.
