I asked Google Bard 3 questions to test its new image capabilities — here's how it went

Google Bard was a focal point of Google I/O 2023 earlier this month, and the introduction of multimodal functionality grabbed most of the headlines. Multimodal functionality lets Bard respond to prompts with images and interpret images supplied as part of a prompt, rather than working from text alone, and it's all thanks to the PaLM 2 large language model.

As of this writing, Bard can’t yet handle image inputs, but it can deliver image outputs. So I decided to put this feature to the test by asking it three tech questions that any one of our readers could need an answer to right now. Here’s what happened.

Bard Question 1: Where is the charging port on the MacBook Air M2?

For the first question, I asked Google Bard “Where is the charging port on the Macbook Air M2?” And it did answer my question, saying that “The charging port on the MacBook Air M2 is located on the left side of the laptop, near the Escape key. It is a MagSafe 3 port, which means that it uses a magnetic connector to attach to the laptop.”

Unfortunately, Bard didn’t give me any images for reference — but I quickly realized why. If you look at our tips for using ChatGPT, another popular AI chatbot, the first step is to be specific. Because I didn’t ask Bard specifically for an image, it decided that a text-only response was best. 

So I used our second tip, be conversational, and asked Bard “Can you show me this with an image?” It immediately provided an image of the MacBook Air M2 charging port, pulled from Apple’s press release for the MacBook Air M2, our pick for best laptop. Problem solved.

One note: if you click on the image Bard provides, you'll jump to the image source rather than get an expanded image. There are pros and cons to this method, but just keep that in mind when using Bard and its new capabilities.

Bard Question 2: Which phone takes better photos?

So now that I knew I needed to be specific, I got much more granular when asking Bard which phone takes the better photos. My prompt was, “Which phone takes better photos? The Samsung Galaxy S23? Or the iPhone 14? Show me photos taken by each for comparison.”

Success! This time I got photos on the first try, and Bard did a pretty good job showing me photos comparing one phone against the other.

Unfortunately, Bard still failed to get its facts completely right. Its first point, “The Galaxy S23 has a 50MP main camera, which gives it more detail than the iPhone 14's 48MP main camera,” is incorrect for a few reasons.

First, the iPhone 14 has a 12MP main camera; the iPhone 14 Pro is the one with a 48MP camera. Second, more megapixels doesn’t always mean more photo detail. In our iPhone 14 versus Samsung Galaxy S22 faceoff, our testing showed that Apple's Photonic Engine brought out a lot more detail through image processing than you might expect from a 12MP shooter.

So be careful about taking Bard’s answers as gospel: it still gets things wrong. But because every photo links back to its original source, you can do follow-up research and correct these mistakes yourself.

Bard Question 3: What are the differences between ChatGPT and Google Bard?

Finally, I decided to see how well Bard knows itself. I asked it (again, with great specificity) “What are the differences between ChatGPT and Google Bard's user interfaces? Please show images for each difference.”

Being specific paid dividends yet again, as I immediately got a result with images. Bard said that its own interface was the superior one, as it is “more user-friendly and offers more features than the ChatGPT interface.”

Again, Bard doesn’t get this one quite right, though overall I agree it’s the more visually appealing interface. Bard claimed that ChatGPT does not allow you to upvote or downvote responses (it does) and that ChatGPT cannot connect to the internet, which is no longer true if you have access to the ChatGPT Plus beta.

Bard is better with images — but still flawed 

Make no mistake: although Bard still gets things wrong, the AI chatbot is much better now that it has added multimodal functionality. As a research tool, the ability to provide images in addition to text is a significant upgrade and even makes up for some of Bard’s shortcomings when it comes to accuracy.

If you want to test it for yourself, check out our guide on how to use Google Bard, which will have you asking the chatbot your own questions in no time.

Malcolm McMillan
Senior Streaming Writer

Malcolm McMillan is a senior writer for Tom's Guide, covering all the latest in streaming TV shows and movies. That means news, analysis, recommendations, reviews and more for just about anything you can watch, including sports! If it can be seen on a screen, he can write about it. Previously, Malcolm had been a staff writer for Tom's Guide for over a year, with a focus on artificial intelligence (AI), A/V tech and VR headsets.

Before writing for Tom's Guide, Malcolm worked as a fantasy football analyst writing for several sites and also had a brief stint working for Microsoft selling laptops, Xbox products and even the ill-fated Windows phone. He is passionate about video games and sports, though both cause him to yell at the TV frequently. He proudly sports many tattoos, including an Arsenal tattoo, in honor of the team that causes him to yell at the TV the most.