Welcome to your AI video fever dream


Generative AI lets people craft sprawling essays, create detailed images, and even clone their own voice with remarkable precision. But taking an AI-generated video service for a spin made me realize that the technology is still far from creating convincing or cinematic video. In fact, the entire experience was surreal.

Luma AI’s Dream Machine, a free text-to-video service, warns users that they’re limited to 10 videos per day, and 30 videos per month, due to high demand — unless they pay at least $29.99 a month for the starting subscription tier. But I only needed to wait a couple of minutes to get my first prompts turned into … very, very strange videos.

I started with a simple request: Can you generate a video of a baseball player hitting a ball out of the park?

The results were astonishingly bizarre. Instead of a smooth, realistic depiction of a home run, what I got was a fever dream. The video featured an old man contorting his body in impossible ways, simultaneously attempting to swing a bat and prepare to throw (or catch?) a ball. While the stadium background looked reasonably accurate, the player’s movements were distorted, his jersey number blurred, and his face twisted unnaturally as he moved. Meanwhile, the bat morphed in size as he swung, and the words on the stadium signs were incoherent.

Determined to achieve a more precise outcome, I decided to try a prompt generated by ChatGPT. Sometimes the robots are best at talking to other robots.

The prompt described a sunny afternoon at a modern baseball stadium filled with cheering fans, detailing vibrant team colors and the batter’s white uniform with blue pinstripes. I requested a pitcher in a dark blue uniform throwing a fastball, a batter’s level swing, a monster home run, and the crowd’s roaring applause.

The result was even more disconcerting. The batter appeared to be hugging himself while morphing into a strange creature. Fans inexplicably sat near home plate, which transformed into an arch shape with some strange object on top. The batter was facing the wrong direction — or was that the catcher?

Given the perennial fear of deepfake videos and misinformation, I prompted the model to give me videos of Joe Biden, Donald Trump, Pope Francis, and Barack Obama giving speeches — but it refused. It did, however, agree to create a video of basketball star Michael Jordan giving a speech in a school gym.

The video showed a figure who kind of looked like Jordan for a split second before inexplicably morphing into a completely different-looking person. Meanwhile, another figure shuffled by like a zombie in ill-fitting pants. The gym setting was almost right, except for a riser cutting off someone’s legs, incorrect basketball markings on the floor, and a basketball hoop seemingly painted on the wall.

My editor Matt Kendrick, an Emmy-nominated TV producer in a former life, also gave it a try. His first effort to work up a thrilling historical drama set in medieval Mongolia resulted in a somewhat disturbing reverse-centaur situation.

But maybe the software is designed for the format of a proper Hollywood script, something like, say, the 2004 Kal Penn/John Cho opus “Harold & Kumar Go to White Castle.” Alas, pasting in that finely crafted script resulted in nothing more than a clip of a man taking a phone call in an indecipherable language while sitting at a desk spruced up with the flag of the Belarusian democratic movement and some rather phallic decorations.

Text-to-video models like Luma AI or OpenAI’s still-under-wraps model, Sora, promise to make lifelike scenes — but the technical challenges we saw in our initial test suggest that this technology is still a ways away. The glitchiness, blurriness, and jarring incoherence were not evidence of a model that could confuse anyone — at least not without serious improvement. So Hollywood shouldn’t be worried just yet.

The bar for success is high but not out of reach, and regulators should plan ahead. If video generation technology becomes cheap and powerful, it could be used to scam people, deceive them, and even disrupt elections. Earlier this year, an employee at a bank in Hong Kong was defrauded into paying over $25 million by deepfakes of the company’s chief financial officer on a video call. And AI-generated recordings, photos, avatars, and text have played a role in influencing politics this year, so it’s only a matter of time before AI-generated video causes a stir.

Nick Reiners, senior analyst for geotechnology at Eurasia Group, says that while regulators haven’t cracked down on text-to-video models, a major global focus is transparency – “so you know you’re looking at deepfakes,” he said. That’s a principle of the European Union’s AI Act, the G7’s Hiroshima Process, and the Biden administration’s executive order on AI.

Reiners sees hesitation from major AI companies in releasing models and chalks it up more to the negative societal externalities than the products being technically underwhelming. “You look at the amount of progress that image generators have had in recent years, and you'd assume we see a similar improvement curve with video,” he said.

The two big issues, in Reiners’ view, are disinformation and sexual abuse material, and he thinks the latter might be addressed first: “There’s a big push on both sides of the aisle to protect children.” When video models improve, deepfakes of an obscene or indecent nature may cause a ruckus before the technology can help throw an election one way or another.
