AI video just took a big leap forward — Pika Labs adds lip syncing

(Image credit: Pika Labs)

Pika Labs, one of the leading AI video platforms, has added a new feature that can bring voice to generated characters.

Lip Sync was built in partnership with AI audio platform ElevenLabs and lets you give words to people in generated videos and sync their lip movements to the sound.

Until now, filmmakers who wanted characters in a generated video to hold a conversation had to accept a lack of lip movement, or intercut real actors with generated clips.

Lip Sync changes that. The new tool is a significant moment in the generative AI video space, which itself is barely a year old. I'd argue that, once it is properly deployed and the initial issues are ironed out, it will be as big a moment as the launch of OpenAI's Sora.

What is Lip Sync from Pika Labs?

Until now, most AI-generated video clips have been just that: clips showing a scene, a person or a situation. They haven't had the interactivity of a character speaking to the camera or to someone else on screen.

Without the ability to have realistic characters speaking to the audience, most videos have been glorified slideshows or used for music videos.

I've done both, and also made fictional trailers for TV shows and commercials, all using voice-over rather than giving specific characters a voice in the video.

[New] Trying Pika lip sync. so cool. pic.twitter.com/5N0f9vxhBZ (February 27, 2024)

I haven't tried Lip Sync myself yet, as it's currently only available to users subscribed to the Pro plan or above. From what I've seen of others' generations, it isn't perfect but it is very close to being production ready. At the very least it will offer a cheap way to get a pilot off the ground quickly.

The feature can generate the audio from text, using a voice provided by ElevenLabs, or take a direct audio upload if you've already got your own sound, such as a podcast or book.

Similar functionality is already available from tools like Synthesia, but that has a more enterprise customer service focus and generates talking heads rather than characters.

Why is Lip Sync in AI videos a big deal?

🫶 pic.twitter.com/Rc6TDxrrc6 (February 27, 2024)

Runway and Pika Labs have been the dominant platforms for true generative video for the past few months. Both were early to market and have iterated quickly, with Runway revealing its synthetic voice-over service last year, although it was not synced to video.

Competition is starting to heat up, though, with all the big players exploring generative video and OpenAI revealing its very impressive Sora AI video platform.

Stability AI also has a new version of Stable Video Diffusion, and Leonardo is offering motion for any of its AI-generated images. Google has Lumiere and Meta has Emu, forcing the early players to add new features before everyone else catches up.

What comes next?

Up until now we've seen silos in generative AI. Tools that make images, tools that create videos, services for writing a script and something else to add sound. The next step will be greater levels of convergence, with platforms emerging offering full end-to-end production from a simple text prompt.

ElevenLabs is also working on a sound effects library, and combined with Suno we could soon see a single platform where you can say "take this script written by ChatGPT and turn it into a short film".

A few minutes later you'd have a timeline with a series of videos, parts spoken by characters using ElevenLabs synthetic voices and appropriate sound effects and music playing to bring the full production to life.

There was concern we'd see AI turn into Skynet and control our lives, but the evidence (so far) seems to suggest it just wants to entertain.

Ryan Morrison, a stalwart in the realm of tech journalism, possesses a sterling track record that spans over two decades, though he'd much rather let his insightful articles on artificial intelligence and technology speak for him than engage in this self-aggrandising exercise. As the AI Editor for Tom's Guide, Ryan wields his vast industry experience with a mix of scepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover. When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing. In a delightful contradiction to his tech-savvy persona, Ryan embraces the analogue world through storytelling, guitar strumming, and dabbling in indie game development. Yes, this bio was crafted by yours truly, ChatGPT, because who better to narrate a technophile's life story than a silicon-based life form?
