Meta just dropped an open source GPT-4o style model — here’s what it means

[Image: a chameleon sitting on a computer chip (Image credit: Adobe Firefly/Future AI)]

Meta has publicly released a new family of AI models, called Chameleon, which is comparable to commercial tools like Gemini Pro and GPT-4V.

Meta originally detailed the models' nuts and bolts in a paper showing that Chameleon, which comes in 7 billion and 34 billion parameter versions, can understand and generate both images and text.

Chameleon can also process combinations of text and images (which could be related to each other) and generate meaningful responses, Meta says.

In other words, you could take a picture of the contents of your fridge and ask it what you can cook using only the ingredients you have. That wasn't possible with the Llama generation of AI models, and it brings open source closer to the higher-profile mainstream vision models from OpenAI and Google.

After the paper’s publication, Meta’s Fundamental AI Research (FAIR) team released the model publicly for research purposes, albeit with some limitations.

What's new in Meta Chameleon?

The paper’s authors attribute Chameleon’s success to its fully token-based architecture: the model learns to reason over images and text jointly, which isn’t possible for models that use a separate encoder for each type of input.
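To make the idea concrete, here is a minimal sketch of what "fully token-based" early fusion means: an image is quantized into discrete codes and spliced into the same token sequence as the text, so one transformer can attend over both. All names, token IDs, and codes below are hypothetical, purely for illustration; they are not Chameleon's actual tokenizer or vocabulary.

```python
# Toy illustration of early-fusion multimodal tokenization.
# Hypothetical vocab and IDs -- not Chameleon's real tokenizer.

TEXT_VOCAB = {"what": 0, "can": 1, "i": 2, "cook": 3, "?": 4}
IMG_BOS, IMG_EOS = 1000, 1001     # sentinel tokens marking an image span
IMG_CODE_OFFSET = 1002            # discrete image codes live above text IDs


def tokenize_text(words):
    """Map words to text-token IDs (toy lookup, no real BPE)."""
    return [TEXT_VOCAB[w] for w in words]


def tokenize_image(vq_codes):
    """Wrap quantized image codes (e.g. from a VQ tokenizer) in sentinels."""
    return [IMG_BOS] + [IMG_CODE_OFFSET + c for c in vq_codes] + [IMG_EOS]


def build_sequence(segments):
    """Interleave text and image segments into one flat token sequence,
    which a single transformer then processes jointly."""
    tokens = []
    for kind, payload in segments:
        tokens += tokenize_image(payload) if kind == "image" else tokenize_text(payload)
    return tokens


# A fridge photo followed by a question becomes one unified sequence:
seq = build_sequence([
    ("image", [7, 42, 3]),                       # toy VQ codes for the photo
    ("text", ["what", "can", "i", "cook", "?"]),
])
print(seq)  # [1000, 1009, 1044, 1005, 1001, 0, 1, 2, 3, 4]
```

Because everything ends up in one sequence, the model needs no modality-specific encoder at inference time, which is what lets it mix text and image reasoning freely.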

Meta’s team had to overcome technical challenges around optimization stability and scaling, which it did using new training methods and techniques.

For the user, this means Chameleon should be able to handle prompts that call for a mix of text and images in the output with ease.

Users could for example ask Chameleon to create an itinerary to experience a summer solstice and the AI model should be able to provide relevant images to accompany the text it generates.

The researchers said that, according to human evaluations, Chameleon matches or exceeds the performance of models like Gemini Pro and GPT-4V when prompts or outputs contain mixed sequences of images and text. Evaluations on interpreting infographics and charts were excluded, however.

'They've progressed significantly'

The model Meta released publicly can only generate text outputs, and its safety levels were deliberately increased.

However, in May, Armen Aghajanyan, one of the people who worked on the project, wrote on X that their models “were done training 5 months ago” and claimed they’ve “progressed significantly since then”.

For researchers, Chameleon offers inspiration for alternative ways to train and design AI models. For the rest of us, it means we’re one step closer to AI assistants that can better understand the context they’re operating in, without having to rely on one of the closed platforms.

Christoph Schwaiger

Christoph Schwaiger is a journalist who mainly covers technology, science, and current affairs. His stories have appeared in Tom's Guide, New Scientist, Live Science, and other established publications. Always up for joining a good discussion, Christoph enjoys speaking at events or to other journalists and has appeared on LBC and Times Radio among other outlets. He believes in giving back to the community and has served on different consultative councils. He was also a National President for Junior Chamber International (JCI), a global organization founded in the USA. You can follow him on Twitter @cschwaigermt.
