OpenAI unveils video AI model Sora capable of generating 60-second clips

2024/02/18 Innoverview

OpenAI is not content with just being known as the ChatGPT or even LLM company: today it unveiled a demo of Sora, its new AI text-to-video generation model, with co-founder and CEO Sam Altman posting on X (formerly Twitter) that it was a “remarkable moment.”

The product is not yet available to the general public because, as Altman said in his post, OpenAI is “starting red-teaming,” that is, adversarial testing of its security defenses, flaws and potential for misuse. He did note that it was being made available to a “limited number of creators,” with public expansion to come at a later date.

Intensely competitive space for video AI models

Sora is entering an intensely competitive space, with rival startups Runway, Pika and Stability AI offering dedicated AI video generation models, and stalwarts such as Google showing off the capabilities of its Lumiere model.

But OpenAI’s sample videos of Sora shared today stand out for the sharpness of their resolution, smoothness of motion, human anatomical and physical world accuracy, and most of all, run-time.

Unlike Runway and Pika, which offer just 4 seconds of generation at a time with options to expand, OpenAI’s Sora offers 60-second video generations right off the bat.

Altman and other members of OpenAI’s leadership and Sora team including researcher Will Depue are collecting prompts from users on Twitter/X that they are running through Sora now as a kind of live, crowdsourced demo of the model’s new capabilities, so go over and submit some to them if you are interested (I did).

— Sam Altman (@sama) February 15, 2024

More than even the fantastical videos, the videos showing Sora’s capabilities at replicating mundane but recognizable moments of human life — such as watching the cityscape pass from an elevated train, or a home movie of a woman in bed with a cat — are shockingly realistic.

Also impressive, and potentially alarming: OpenAI researcher Bill Peebles, who is working on the company’s effort to develop “artificial general intelligence” (defined as AI that performs better than most humans at most economically valuable tasks), noted that Sora would help the quest for AGI by “simulating everything.”

Sora is here! It's a diffusion transformer that can generate up to a minute of 1080p video with great coherence and quality. @_tim_brooks and I have been working on this at @openai for a year, and we're pumped about pursuing AGI by simulating everything!

— Bill Peebles (@billpeeb) February 15, 2024

Amid a renewed push by U.S. federal agencies to regulate AI, specifically for its potential for fraud and deepfakes of real people, the advent of Sora seems like a milestone not just for OpenAI but for the entire tech and media industry, and for humanity generally, though whether for better or worse remains to be seen.

(Copyright: VentureBeat OpenAI unveils video AI model Sora capable of 60-second clips | VentureBeat)