AI video platform Runway will release its Gen-3 model “in the next few days” and it will include “major improvement in fidelity, consistency, and motion over previous generations of models,” while also being considerably faster, the company told Tom’s Guide.

Runway released Gen-2, the first commercially available text-to-video AI model, in June last year, and since then a revolution in synthetic video has been unleashed on the world. Runway now competes with the likes of Pika Labs, Haiper, Luma Labs and the yet-to-be-released Sora.

Gen-3 is a major step-change for Runway and the AI video space. It was rebuilt from the ground up using a new generation of infrastructure purpose-built for large-scale multimodal training. The new model was trained on images and video at the same time for improved realism.

The public will be able to get access to an Alpha version “in the next few days.” Anastasis Germanidis, Runway CTO and co-founder, told me this was the smallest of a new generation of frontier AI models coming as a result of the new training infrastructure.

What makes Runway Gen-3 different?

(Image credit: Runway Gen-3)

Runway Gen-3 includes an improved ability to control motion within a video, as well as a better understanding of real-world movement and physics. Combine that with its photorealism and you’ve got a model that can create videos almost indistinguishable from reality.

There were some surprises for the team when they first used Gen-3 after it completed training, including its approach to scene creation. This is possible thanks to a minimum 10-second video length. The previous generation capped out at about four seconds.

“The ability to create unusual transitions has been one of the most fun and surprising ways we’ve been using Gen-3 Alpha internally,” said Germanidis. He told me: “The model is able to incorporate and make sense of drastic changes in the environment with very pleasing results.”

As well as changing scenes and environments, you have a much greater degree of “temporal control,” because the model was trained with “multiple highly descriptive captions per scene, which makes it capable of generating videos that have unusual and interesting transitions of environment and action, as well as precise key-framing of specific elements in time,” he explained.

“These model improvements paired with existing control modes such as Motion Brush, Advanced Camera Controls, and Director Mode give our users more control than ever before.”

You can start with images, text or even video using Gen-3, whereas Gen-2 doesn’t support video as an input. It doesn’t matter which you use, according to Germanidis. “Gen-3 Alpha improves significantly in terms of temporal consistency and has much-reduced morphing compared to Gen-2 for both text and image inputs.”

Creating a General World Model

(Image credit: Runway Gen-3)

Germanidis told Tom’s Guide this was the “first of the next generation of foundation models trained by Runway from the ground up.” He added that future versions “will reach and exceed the scale of large language models,” such as Google Gemini and Anthropic’s Claude.

In the same way that big AI labs like OpenAI and Anthropic are working towards Artificial General Intelligence (AGI), Runway is working to build “General World Models.”

“A general world model,” explained Germanidis, “is an AI system that builds an internal representation of an environment, and uses it to simulate future events within that environment.”

“The aim of general world models will be to represent and simulate a wide range of situations and interactions, like those encountered in the real world,” he added.

While Gen-3 isn’t in itself a General World Model, it is the first step, Germanidis told me. “It’s still very early, and this is the first and smallest of our upcoming models.”

“The model can struggle with complex character and object interactions, and generations don’t always follow the laws of physics precisely,” he warned. So don’t get overly excited, but remember this is just step one.