Last week Meta quietly dropped a new standalone artificial intelligence image generator product. Unlike its other AI tools, this was released under the Meta brand.
Imagine with Meta is free to use but does come with the caveat that images may be inaccurate or inappropriate. It is also currently only available in the U.S.
With this new product, Meta is showcasing its long history of AI research, which includes developing and open-sourcing some of the most powerful and important tools in the space. It is also competing with big players like Midjourney, Stable Diffusion, and OpenAI.
Imagine is the underlying model that powers image generation capabilities in WhatsApp, Instagram, and Facebook. It also powers the new reimagine feature in Messenger group chats that lets participants work together on a generative AI image.
Imagine with Meta: What sets it apart?
The big difference between the way Imagine works and tools like DALL-E 3 in ChatGPT or StabilityAI’s SDXL 1.0 is in granularity. Unlike the older AI models, Imagine gives you no fine-tuning or control over aspects of the output.
You only have a single text prompt to define the image you want the model to create. You can specify things like a particular style, but you can’t change the size, resolution or number of images it generates.
It gives you four choices from your prompt, and all are in a square format with a 1280x1280 resolution — slightly larger than DALL-E 3 — and all come with a Meta watermark.
One of the most notable differences between Imagine with Meta and other image generators is its speed. It generated images almost instantly, taking about as long to create its images as SDXL Turbo, the rapid-turnaround live image generator from StabilityAI I tested recently.
The speed may be because the tool is relatively new and not yet as widely used as the big players. It could also be due to Meta’s focus on delivering AI at scale.
Testing Imagine with Meta
To test out Imagine with Meta I turned to ChatGPT for inspiration. I asked the premium version of OpenAI’s chatbot to craft a series of prompts that would put the image generator through its paces and test its full capabilities.
The prompts covered the creation of diverse and intricate scenes, artistic styles, complex narrative elements, futuristic and fantasy settings and a detailed logo and brand concept.
First up was a cityscape, specifically a Victorian city with a steampunk twist. It needed to be filled with ornate buildings made of brass and copper and people wearing Victorian attire with mechanical enhancements.
It created an attractive and engaging image, but it didn’t really fulfill the steampunk brief. It looked more like a typical English Victorian street scene with an empty road and shops.
Next, I had Imagine generate an alien planet’s underwater world, requiring it to depict bioluminescent plants and creatures with bold colors. It also needed to show a diversity of life unlike anything seen on Earth and play with shadows.
It did a really good job of capturing the otherworldly nature of the prompt. It depicted a natural scene in the foreground with a glowing city in the background. The only issue was that it depicted stars in the night sky when the entire scene was supposed to be deep underwater.
Heading to the future
For the next prompt, we headed to Mars, specifically a future Mars colony with domed habitats set in the desert landscape of the red planet. I felt the result was a little on the comic side, but the prompt didn’t ask for a photorealistic image.
The biggest problem was one shared across all image generators when creating images of a planet — it put weird versions of the planet on the surface and generated random moons in the sky.
Mythical forest at twilight
The final photographic image generation was of a mythical forest at twilight. This required Imagine with Meta to generate different levels of lighting. It had to show ancient and gigantic trees, a forest floor with luminescent mushrooms, and ethereal creatures like fairies and sprites.
This had some similarities to the underwater aliens image. It depicted the trees as more of a background feature, focusing on the mushrooms. Instead of sprites and fairies, it picked a unicorn as its mythical creature of choice.
A text challenge
Finally, I asked Imagine with Meta to generate a logo for a fictional yarn-selling business called Cat in a Basket. The challenge was to see how it interpreted the logo, but also how well it generates text within the image. Only DALL-E 3 does this well out of the general models.
It created a fun logo but handled the text about as well as SDXL 1.0, Midjourney or the previous version of OpenAI’s DALL-E, which is to say not very well at all. It garbled the words, rendering them as “A Catt ia ana Batkett”, as if it were writing in an obscure old English dialect. But the logo was cute.
Ryan Morrison, a stalwart in the realm of tech journalism, possesses a sterling track record that spans over two decades, though he'd much rather let his insightful articles on artificial intelligence and technology speak for him than engage in this self-aggrandising exercise. As the AI Editor for Tom's Guide, Ryan wields his vast industry experience with a mix of scepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover.
When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing. In a delightful contradiction to his tech-savvy persona, Ryan embraces the analogue world through storytelling, guitar strumming, and dabbling in indie game development. Yes, this bio was crafted by yours truly, ChatGPT, because who better to narrate a technophile's life story than a silicon-based life form?