I put 5 of the best AI image generators to the test using NightCafe — this one took the top spot
All your favorites in one place

Competition in the AI image generator space is intense, with multiple companies like Ideogram, Midjourney and OpenAI hoping to convince you to use their offerings. That is why I'm a fan of NightCafe and have been using it for a few years. It has all the major models in one place, including DALL-E 3, Flux, Google Imagen and Ideogram.
I've created a lot of AI images over the years and every model brings something different. For example, Flux is a great general-purpose model that comes in several versions, Imagen 4 is incredible for realism, and Ideogram does text better than anything but GPT-4o.
With NightCafe you can try the same prompt across multiple models to see which you prefer. You can also create a realistic image of, say, a train station using Google Imagen, then use that as a starter image for an Ideogram project to overlay a caption or stylized logo.
NightCafe also offers most of the major video models including Kling, Runway Gen-4, Luma Dream Machine and Wan 2.1. For this test we’re focusing on image models.
Picking a favorite model
Having all those models to hand is a great way to test each of them to find the one that best matches your personal aesthetic — and they’re each more different than you think.
As well as the 'headline' models like Flux and Imagen, there are also community models that are fine-tuned versions of Flux and Stable Diffusion. For this test I focused on the core models: OpenAI GPT Image-1, Recraft v3, Google Imagen 4, Ideogram 3 and Flux Kontext.
I’ve come up with a prompt to try across each model. It requires a degree of photorealism, it presents a complex scene and includes a subtle text requirement.
The prompt: “A small independent coffee van parked on a quiet cobblestone street in Paris during early autumn, captured in candid 35mm street photography style with natural light and shallow depth of field. Golden morning sunlight reflects off the damp stones after a light rain. The van is a matte forest green Citroën Type H, with a hand-painted chalkboard sign leaning against it that reads “Café du Matin” in elegant cursive. A barista in a denim apron hands a coffee to a smiling elderly woman in a beige trench coat holding a small umbrella. Fallen leaves gather near the tyres, and gentle steam rises from takeaway cups on the wooden counter.”
1. Google Imagen 4
Google’s Imagen 4 is the model you’ll use if you ask the Gemini app to create an image of something for you. It's also the model used in Google Slides when you create images.
This was the first image for this test and, while it captured the steam rising, it overemphasised it a little. It did create a visually compelling scene and followed the requirement for the two people in the scene. It captured the correct vehicle but there’s no sign of the text.
2. Flux Kontext Max
Black Forest Labs' Flux models are among the most versatile, and several of them are available as open-weight releases. With the arrival of the Kontext variant, we got image models that also understand natural language better. This means that, a bit like OpenAI’s native image generation in GPT-4o, it gives much more accurate results, especially when rendering text or complex scenes.
Flux Kontext captured the 'Cafe Matin' perfectly and got the woman right. It somehow feels more French than Imagen, but I don't think it's as photographically accurate.
3. OpenAI GPT Image-1
GPT Image-1, not to be confused with the original GPT-1 model from 2018, is a multimodal model from OpenAI designed for improved rendering accuracy. It is used by Adobe, Figma, Canva and NightCafe. Like Kontext, it has a better understanding of natural language prompts.
One downside to this model is that it can’t do 9:16 or 16:9 images, only variants of square. It captured the truck and the name, but I don't think the scene is as good. It also randomly generated a second umbrella, and the placement of the hands feels unreal.
4. Ideogram v3
Ideogram has been one of my favorite AI image models since it launched. Always able to generate legible text, it is also more flexible in terms of style than the other models. The Ideogram website includes a well designed canvas and built-in upscaler.
The result isn’t perfect: the barista leans funny, but the lighting is more realistic. The scene is also more realistic, with the truck on the sidewalk instead of the road. It also feels more modern and the text is both legible and well designed.
5. Recraft v3
Recraft is more of a design model, perfect for both rendered text and illustration, but that doesn’t mean it can’t create a stunning image. When it hit the market it shook things up, beating other models to the top of leaderboards.
I wasn’t overly impressed with the output. Yes, it's the most visually striking, in part thanks to the space given to the scene. But it overemphasises the steam, and where is the barista? Also, for a model geared around text, there’s no sign writing.
My favorite: Flux Kontext Max
While Flux had a number of issues visually, it was the most consistent and it included legible sign writing. If I were using this commercially, as a stock image, I’d go with the Google Imagen 4 image, but from a purely visual perspective — Flux wins.
What you also get with Flux Kontext is easy adaptation. You could make a secondary prompt to change the truck color or replace the old lady with a businessman. You can do that in Gemini but not with Imagen. You’d need to use native image generation from Gemini 2+.
If you want to make a change to any image using Kontext, even if it wasn't a Kontext image originally, just click on the image in NightCafe and select "Prompt to Edit". It costs about 2.5 credits and takes nothing more than a simple descriptive text prompt.
Final thoughts on NightCafe
I used the most expensive version of each model for this test, the one that takes the most processing time to work on each image. This allowed for the fairest comparison. What surprises me is just how differently each model interprets the same descriptive prompt. But it doesn't surprise me how much better they’ve all got at following that description.
What I love about NightCafe, though, is that it's a one-stop shop for AI content. It isn’t just a place to use all the leading image and video models; it also contains a large community with a range of games, activities and groups centered around content creation. You can also edit, enhance, fix faces, upscale and expand any image you create within the app.
Ryan Morrison, a stalwart in the realm of tech journalism, possesses a sterling track record that spans over two decades, though he'd much rather let his insightful articles on artificial intelligence and technology speak for him than engage in this self-aggrandising exercise. As the AI Editor for Tom's Guide, Ryan wields his vast industry experience with a mix of scepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover.
When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing. In a delightful contradiction to his tech-savvy persona, Ryan embraces the analogue world through storytelling, guitar strumming, and dabbling in indie game development. Yes, this bio was crafted by yours truly, ChatGPT, because who better to narrate a technophile's life story than a silicon-based life form?