Online apps like Dall-E, Midjourney, and Stable Diffusion allow us to create images (often quite good ones) from just a series of prompts. Video is the next frontier in generative AI art.
We're not at the point where we can simply describe a story and watch AI spin out a movie or TV series — or even create animations approaching the quality of still-image generators. But AI can produce compelling shortform content for entertainment, education, training, or marketing by finding and assembling videos, images, and music, then generating narration — all based on your script. Some apps can even write the script for you.
There's no shortage of apps providing these capabilities. We narrowed our initial list of 10 down to the best five, based on preliminary tests, features, popularity among users, and price options.
What makes the best AI video generators?
The best app does the most work for you. This starts with picking the right footage, based on your script, from their clip libraries. We tested this by writing a script for an educational video about the ancient city of Petra in Jordan (a UNESCO World Heritage site and popular movie setting) that ran between one and two minutes, depending on the app. (For the apps that support it, we also tested their ability to generate a full script based on the prompt "create an educational video about petra jordan including the five top sites.") We used Google Chrome for all tests.
Even the best one guessed right only half the time. If the word "Petra" was in a line of the script, the apps generally found a fitting clip for the scene. They had a harder time keeping context--"remembering" to find a scene of Petra for parts without that word (even though it was in the title of every video we generated).
The other AI generation features largely came down to taste: You may or may not like the music and voices the app selects or the font it chooses for subtitles. However, some programs had more natural-sounding AI voices than others.
All these projects require considerable tweaking and refinement, so their tools for this are critical. How big are the libraries of videos, photos, music, and voices? And how precise are the search tools for finding exactly what you're looking for? (These apps also allow you to upload your own visuals and audio and record a human voiceover.)
The best apps also provide full, intuitive editing control--allowing you to easily adjust audio levels, set the length of clips, change the output dimensions and resolution, tweak the voices, and crop, pan, and zoom in scenes. If you get stuck, some provide online support (one of them available 24/7) to talk you through your project.
Price is also critical. All these apps have free versions, but some are so restricted--with watermarks, resolution caps, length limits, and paucity of media assets--that they are barely usable. Beyond that, they move to subscription plans (monthly or annual). Sometimes the first tier (around $25) offers a rich assortment of features; other times, companies really push you to upgrade to a higher level. We evaluated all apps based on their first paid-tier offerings (priced for a month-to-month, rather than annual, subscription).
| Product | Monthly cost | AI script writer | Auto clip-selection success rate | Online support |
|---|---|---|---|---|
| Fliki | $28 | Yes | 33% | Chat (not realtime) |
| InVideo | $30 | Yes | 45% | Chat (realtime 24/7) |
Each review shows both the entirely automated video the app created and a tweaked version with improved visuals (including original clips and photos of The Siq, Street of Facades, and closing scenes), and often different music, voices, and scene lengths.
InVideo's editing interface displays the full script on the left, broken into the associated clips. It provides an intuitive overview, especially for a video that was generated directly from that script. A preview of the video appears on the right, with an extensive context-sensitive toolbar. Image-adjustment tools appear when you click on the video preview, and text-formatting tools appear when you click on the script. You can also switch to a traditional timeline view with the video and multiple audio tracks stacked for individual editing.
The app churned out a whopping 15 scenes from our script because (like Designs.ai) it creates a new one for every sentence, and it can't be set to use line breaks instead. (It also generated a passable script on its own, based on our prompt.) InVideo selected a fitting video clip for five of those 11 scenes, a better-than-average success rate. You can choose from some 5,000 templates for your video, which can be overwhelming, especially because many look rather similar and there aren't good search filters. (You can also create your own custom templates.)
InVideo has several image-enhancement extras, though some are a bit gimmicky. You can overlay emojis or icons, or apply masks that cut out a portion of the video to view. You can also upload your own corporate logo. And InVideo provides generative image creation. (We got a fanciful image from the prompt "map of ancient middle east.")
InVideo provides a rich selection of audio clips. Music can be filtered by 10 categories, such as Angry, Relaxing, Love, and Epic, or by 20 genres, such as Rock, Chillout, World, Horror, and Ambient. It also provides 63 categories of sound effects, like Alarms, Bells, and Birds.
If you get stuck using any of these features, you can take advantage of InVideo's 24/7 online chat support (the most generous offering in this roundup).
Fliki offers a crisp, intuitive interface. Your script appears on the left, broken into scenes, with a thumbnail of the associated video clip. A popup menu on each script item allows you to make extensive edits, such as moving the section up or down in the order or changing voice, video clip, or scene duration. A preview of your resulting video appears on the right.
Fliki can generate content in 78 languages, the most of all the apps we reviewed, with an inclusive roster including Afrikaans, Amharic, Azerbaijani, Basque, Khmer, Maltese, Sinhala, Swahili, and Zulu. It also offers a huge selection of voices: 85 for English alone.
Several of those voices support up to 11 voice styles, such as Cheerful, Excited, Hopeful, Sad, Terrified, and Whispering. Users on the $88/month premium plan can also clone their own voice for text-to-speech generation. For our custom creation we chose "Jane," with a friendly style. She's a bit of a fast talker, though, and the app only allows you to adjust the speed on some voices (excluding Jane).
Fliki found suitable video for just 4 of the 11 scenes, a middling rate. For several, it didn't even try, just showing a colored screen. Uploading our own footage was frustrating because Fliki doesn't allow you to select what portion of the raw footage appears. If, for instance, you upload a 20-second video to use in a 10-second scene, you have no choice but to use the first 10 seconds of that uploaded video. Otherwise, you'd have to edit the clip to that portion in another app before uploading.
We inquired about some of these limitations using Fliki's online chat support, which provided courteous, detailed answers within a few hours.
Fliki can generate a script from a simple prompt, such as "create an educational video about petra jordan including the five top sites," and the text was passable. But Fliki forces you to choose from just three durations for the video: 1, 5, or 10 minutes.
At $23 (paying monthly), Pictory offers the lowest price of the services. But it also had the lowest success rate finding video clips, getting just three of 11 scenes right. (You can opt for it to autogenerate scenes for each new sentence, paragraph, or both.) For a line about Petra beginning, "It was a significant trade center," Pictory chose a clip from the interior of a commuter rail station located near the World Trade Center in New York City. The interface allows you to highlight key terms in your script to help Pictory better guess what you're looking for, but that didn't seem to help.
Fortunately, Pictory has great tools for finding better clips. Its video library suggests genuinely relevant search terms, and it offers granular capabilities. For instance, we were looking for a clip of a building in Petra known as The Monastery. Just typing in "petra mon" brought up detailed suggestions such as "petra monastery interior" and "petra monastery steps."
Pictory also has great audio search, allowing you to filter tracks by over 60 moods, over 70 purposes (such as heroic, sci-fi, or game show), over 110 genres, and by any duration up to 20 minutes. Pictory didn't generate a voiceover by default. But it was easy to select a pleasant voice, and the app allows you to adjust its rate of speech.
The interface has the script-on-left/preview-on-right layout we liked with apps such as Fliki. It's easy to tweak scenes, such as setting their length or looping the background audio so it doesn't time out. Trimming the raw video clips was a bit awkward, though. You drag sliders in a popup interface to set the beginning and end points and the overall length to appear in the scene. But there is no "OK" button in the interface, and clicking the "X" button cancels your changes. So to save your changes, you have to simply click somewhere outside the popup.
While Pictory allows you to create and download videos with its free service, every clip has a watermark.
Visla offers the most usable free plan of all the apps we reviewed. It allows you to easily select free stock clips, which appear with no watermarks; and it outputs at high resolutions such as 1080p for the widescreen aspect ratio. It likewise breaks audio clip selections into free and paid. And it allows 100,000 characters of AI voice generation.
The one downside is that free videos feature a Visla-branded splash screen at the end (which you can trim off the downloaded video using even a basic free app like QuickTime Player). The Premium plan, which removes branding and increases allotments (such as unlimited video creation) is the second-cheapest paid tier we reviewed, at $24 (monthly).
Free or paid, Visla was the most accurate at finding video in our tests, succeeding at six of 11 scenes (still not amazing). You can set it to generate a new scene for every sentence or paragraph. Visla can also generate a passable script from a basic prompt, and you can set parameters for the type (such as marketing, technical, or inspirational) and tone (such as professional, relaxed, or witty). You can also order up videos directly in ChatGPT using the new Visla plugin.
However, Visla also has some significant limitations. Top of the list: It supports only English for script-to-video generation and for voiceovers. It also has just 23 voice options, though we found some pleasant ones. There are also some limitations in editing. The tool for changing the length of a scene didn't work in our testing. And we found no way to zoom into a squarish still photo that we used for one of our scenes, so it appeared with blue bars on each side.
Inmagine's Designs.ai is a suite of creation tools. In addition to video, it provides AI-generation modules for text, logos, and designs such as ads or brochures. If your needs extend to any of these other areas, the $29/month price starts looking more reasonable. Its ability to generate content in 27 languages and its choice of 15 interface languages could also broaden the appeal. But as a video solution alone, Designs.ai is not your best bet.
The app was mediocre at finding good video content, succeeding in just five of 16 scenes. (We got so many scenes because the app generates a new one for every sentence in the script; some apps allow you to decide whether each sentence, each paragraph, or both triggers a scene change.) Some of its failures were quite funny, like choosing an orchestra playing to illustrate the line: "Its intricate design reflects the artistry and sophistication of the Nabateans." That said, none of these apps are great at the scene-guessing game. Users on the free plan can generate a video using the full library of clips, but they can't actually download a movie including premium clips without upgrading, and the free clip selection is quite limited.
The voices included in the free and base paid plans sound rather stilted, and accessing "premium" voices requires upgrading to the $69/month Pro plan. The app did automatically set a voiceover volume level that was easy to hear over the auto-selected background music.
Designs.ai was the only program to glitch when we uploaded our own videos, often reporting that they were missing or defective. It took logging out and restarting Chrome to remedy the problem. Even then, we found no way to adjust or mute the volume on the clips, leaving some unwanted noise and chatter in our final movie. The app is also a tad sluggish to load a video project.
Want to know more about using AI for creative work? Here's our breakdown of the 5 best AI image generators.