AI video just got personal — Mochi-1 lets you train your own model on a few videos

(Image credit: Genmo Mochi-1/AI generated)

Genmo, the San Francisco-based open-source AI video lab, has just announced a new add-on for Mochi-1, its state-of-the-art video generation model.

The new fine-tuning tool lets users customize the model's video output by training it on a modest number of additional video clips. Fine-tuning video output like this is not new, but this is the first time we have seen it released as an open-source video product.

The tuning is done using standard LoRA (low-rank adaptation) technology, which has long been used to fine-tune image models to produce a desired output.

By using low-rank adapters like this, users can take a generalized model and customize it to taste. One example would be teaching the model a specific product logo so it appears consistently in generated videos.
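To make the idea concrete, here is a minimal sketch of a LoRA-style adapter in PyTorch. This isn't Genmo's code, and the rank and scaling values are illustrative defaults; it just shows the core trick: the original weights stay frozen, and only two small low-rank matrices are trained.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen linear layer with a trainable low-rank update.

    The effective weight becomes W + (alpha / r) * B @ A, where A and B
    are small rank-r matrices. Only A and B receive gradients.
    """
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the original weights

        # Low-rank factors: B starts at zero, so training begins
        # from the unmodified base model.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Original path plus the scaled low-rank correction.
        return self.base(x) + self.scale * (x @ self.lora_A.T @ self.lora_B.T)
```

Wrap, say, a model's attention projection layers in an adapter like this and only the small A and B matrices need to be stored and trained, which is why a handful of example clips can be enough to steer the output.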

What is Mochi-1?

(Image credit: Genmo Mochi-1/AI generated)

Mochi 1 caused quite a stir when it launched thanks to the superb quality of its video output, so this latest development is a significant milestone in the race towards cinema-quality, versatile AI video. It is available from the Genmo website.

As with many recent AI announcements, the Mochi 1 fine-tuner is less of a mass-market product and more of a research experiment. This early iteration, while designed to work with a single graphics card, will only run on systems with an expensive top-end graphics processor carrying at least 60GB of VRAM, which immediately puts it out of the reach of ordinary mortals.

The launch demo suggests that you'll need no more than a dozen video clips to fine-tune the model to your needs, which is a pretty impressive feat for video. But interested parties will also need a good deal of familiarity with coding and command-line interfaces to get the system working (see the sketch below). Not for the faint of heart, therefore.
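Genmo hasn't published the trainer's full interface here, so what follows is only a shape-of-the-workflow sketch in PyTorch: the `finetune` function, the `clip_dataset` argument, and the `model.training_loss` hook are hypothetical stand-ins, not names from Genmo's actual repository. The point it illustrates is real, though: because the base weights are frozen, the optimizer only tracks the tiny LoRA factors, which is why a dozen clips and one GPU can be enough.

```python
import itertools

import torch

def finetune(model, clip_dataset, steps: int = 1000, lr: float = 1e-4):
    # Only parameters still requiring gradients (the LoRA factors) go to
    # the optimizer, so optimizer state stays tiny versus full fine-tuning.
    trainable = [p for p in model.parameters() if p.requires_grad]
    opt = torch.optim.AdamW(trainable, lr=lr)

    loader = torch.utils.data.DataLoader(clip_dataset, batch_size=1, shuffle=True)
    batches = itertools.cycle(loader)  # a dozen clips get revisited many times

    for _ in range(steps):
        video, caption = next(batches)
        loss = model.training_loss(video, caption)  # hypothetical loss hook
        loss.backward()
        opt.step()
        opt.zero_grad(set_to_none=True)
```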

The value of open-source

(Image credit: Genmo Mochi-1/AI generated)

Open-source video seems to be the flavor du jour, judging by the number of announcements coming down the pike. The Allegro T2V model dropped this week, another open-source video technology that holds promise.

It generates six seconds of 720p video from a text prompt, but the key feature is that it all happens inside 9GB of VRAM, which sounds like an excellent use of space.

Again, there's no fancy wrapper to make it easy for end users at the moment, but hopefully one will come soon.

In the meantime, I'm just gonna sit here with my jumbo box of hot buttered popcorn and keep looking towards the door for the arrival of Sora. Whatever happened to Sora? Anybody know? Any guesses? Sam? Anyone?

Nigel Powell
Tech Journalist

Nigel Powell is an author, columnist, and consultant with over 30 years of experience in the technology industry. He produced the weekly Don't Panic technology column in the Sunday Times newspaper for 16 years and is the author of the Sunday Times book of Computer Answers, published by Harper Collins. He has been a technology pundit on Sky Television's Global Village program and a regular contributor to BBC Radio Five's Men's Hour.

He has an Honours degree in law (LLB) and a Master's Degree in Business Administration (MBA), and his work has made him an expert in all things software, AI, security, privacy, mobile, and other tech innovations. Nigel currently lives in West London and enjoys spending time meditating and listening to music.