People are getting excited about using the latest version of OpenAI’s image generator, DALL-E 3, right inside Bing Chat. After putting it through three different tests, now I understand why.
You’re probably familiar with ChatGPT, the AI chatbot that’s also part of OpenAI’s portfolio. Paying Plus and Enterprise users can access DALL-E 3 through that chatbot. But if you’re looking for a free way to try out the image generator for yourself, you can do so using Bing Chat and Bing Image Creator. These Microsoft products are now also powered by DALL-E 3.
Going down the Microsoft route gets you a limited number of so-called “boosts”. Having a boost means your image processing goes faster. If you use up all your boosts, expect to wait longer for Bing to generate your images. You can get more boosts through Microsoft’s Rewards loyalty program.
DALL-E 2 already ranks among the best AI image generators out there. But its successor, DALL-E 3, understands significantly more nuance and detail.
“Modern text-to-image systems have a tendency to ignore words or descriptions, forcing users to learn prompt engineering. DALL·E 3 represents a leap forward in our ability to generate images that exactly adhere to the text you provide,” said creator OpenAI.
Bold words, but we’ll be the judge of that! I designed a three-part test covering artistic capabilities, lifelike creations, and crisp, legible text generation. Could Bing Chat overcome the challenge?
First test: The birth of art
DALL-E 3 is set up to decline prompts that ask for images in the style of living artists. Fair enough. I played it safe and went with a prompt for cave art featuring five humans discovering the world’s first laptop.
In my head, I pictured a group of stick figures aiming a pointy stick at a square. Bing Chat said, “I’ll do you one better.”
Second test: How real can you go?
For creating humans with more fingers than you can count, you can pick any run-of-the-mill image generator. Could Bing Chat deliver an image of the CEO of a new tech company? One that could pass as an ordinary LinkedIn profile picture?
Ish? Sure, the output was decent but the image felt more like a realistic 3D render than an actual photo. I pointed that out to the chatbot, but the second attempt was just more of the same. Some of the facial features seem more natural than what I’ve achieved with other image generators, but it’s not enough to convince me that Bing Chat deserves the gold medal.
Final test: A fresh logo with text
OK, this part of the test had me the most excited. I’d all but given up on seeing legible text in AI-generated images, but I’ve seen examples created by DALL-E 3 that got me itching to give logo design a whirl.
I switched to Bing Chat’s creative mode. The brakes were off.
I requested a new Tom’s Guide logo, one that was both modern and minimalistic. It had to feature something that symbolized the evolution of technology. The most crucial part: the words “Tom’s Guide” had to appear in a crisp and legible way underneath the image portion of the logo.
The first set of logos featured a cliché row of evolving figures with mixed results for the text. I asked for some revisions based on these observations. Was I being too harsh? If so, Bing Chat didn’t show any signs of resentment towards me as it delivered a second set of logos.
There was some improvement based on my suggestions, but what I enjoyed most was being able to chat with the image generator as though I were speaking with a graphic designer.
All in all, I feel that creating AI images is still a numbers game. You have to try a mix of image generators and throw enough tweaked prompts at them until something sticks. Should you include Bing Chat in your attempts? Definitely.