I made my Superman action figure talk with Pika Labs’ new AI lip sync tool — watch this

(Image credit: Pika Labs AI lip sync/Ryan Morrison)

Artificial intelligence video startup Pika Labs has a feature that lets you give your generated characters a voice. Lip Sync moves the lips of a humanoid character in a video or image to match a sound file or typed text, then renders the result as a new video. It even works with action figures.

This is one of a number of solutions designed to tackle one of the biggest issues with generative artificial intelligence video at the moment — the lack of interactivity.

That isn’t to say dialogue doesn’t exist, and it is an interesting form of storytelling but it lacks the natural feel of a movie or TV show where characters speak to each other or to the camera.

Testing Pika Labs lip sync

Lip Sync works from an uploaded audio file. That could come from a podcast or an audiobook, or be generated using a text-to-speech engine like ElevenLabs or Amazon Polly.

However, if you don’t have an audio file, Pika Labs has integrated the ElevenLabs API into the process, allowing you to select from any of its default voices and just type what you want the character to say.

To get a feel for just how well the new lip sync feature works I asked Claude 3 to create five characters and give each a line of dialogue. I then used Ideogram to visualize the characters and finally I put each of them through the Lip Sync tool.

There are several ways to approach this. You can provide Pika Labs with a pre-animated video file, give it a still image, or convert an image to a video and then apply Lip Sync to the resulting clip.

How well did it perform?

The first test I tried was giving it a short video I filmed of a Superman action figure I have on the shelf behind my desk. It worked perfectly, easily animating the mouth movement.

To test this even further I filmed a video of Black Widow and then one of Woody from Toy Story, both of which also worked well. Robot Chicken-style shows can now be made at home.

Time to move on to the characters I created with Claude and Ideogram. The idea was to write a short script and have each character read a line, then edit them together.

I started by using the image and having it simply animate the lips — going from image to lip sync straight away, without creating a video first. While this worked well for the lip movement, it did look somewhat flat, like a talking photo rather than a video.

The first of these characters was the android. It had blue skin and resembled a humanoid but not an actual human. That pushes the tool, which currently requires depictions of humans facing the camera to work effectively. It handled it well, but with some artefacts around the lips.

I converted all five images into video using the usual Pika Labs image-to-video model and then applied lip sync to the resulting short clip. This worked better than going straight from the still image for all but the depiction of a human. Pika Labs kept over-animating his face, so I stuck with the still.

Bringing a new dimension to video

screenshot of Pika Labs lip sync screen — (Image credit: Pika Labs)

The final test was to see if I could make a character sing by giving it a song file from Suno. The issue is that Lip Sync tends to be limited to a few seconds of audio, so I had to pick the right clip and focused on the chorus. It worked really well and will make a good addition to my next music video project.
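For anyone who wants to cut a chorus down to a few seconds before uploading, here is a minimal sketch of one way to do it. It is not how I trimmed my clip, and it is not part of Pika Labs or Suno. It assumes the song has already been exported or converted to an uncompressed WAV file, and the function name `trim_wav` and the file names are hypothetical. It uses only Python's standard `wave` module.

```python
import wave


def trim_wav(src_path, dst_path, start_s, duration_s):
    """Copy up to duration_s seconds of audio, starting at start_s, into a new WAV.

    Hypothetical helper for illustration. Returns the length in seconds
    of the clip that was actually written.
    """
    with wave.open(src_path, "rb") as src:
        rate = src.getframerate()
        total = src.getnframes()
        # Clamp the requested window so it never runs past the end of the file.
        start = min(int(start_s * rate), total)
        count = min(int(duration_s * rate), total - start)
        src.setpos(start)
        frames = src.readframes(count)
        params = src.getparams()

    with wave.open(dst_path, "wb") as dst:
        # Reuse the source format; the frame count in the header is
        # corrected automatically when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)

    return count / rate


# Example call (file names are placeholders):
# trim_wav("song.wav", "chorus.wav", start_s=42.0, duration_s=5.0)
```

The start time and duration are whatever fits the chorus and the tool's current length limit, so expect a little trial and error.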

Overall I think Pika Labs has done an exceptional job with its lip sync tool. It creates captivating lip movement but does fall short on other facial animation if the video hasn't already been animated.

There are other services out there, including Synclabs, that will create longer videos, but the fact this is built into a generative video tool and integrates ElevenLabs gives it an edge.

All Pika Labs needs now is a timeline tool to allow for video editing directly within the platform, as well as integration with a sound effects generator like AudioCraft, and it will be on its way to being a full production-level tool.

Ryan Morrison, a stalwart in the realm of tech journalism, possesses a sterling track record that spans over two decades, though he'd much rather let his insightful articles on AI and technology speak for him than engage in this self-aggrandising exercise. As the former AI Editor for Tom's Guide, Ryan wields his vast industry experience with a mix of scepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover.
When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing.
