OpenAI GPT-4o voice assistant just put Siri on notice with real smarts — and a real personality

Real-time translation with ChatGPT
(Image credit: OpenAI)

Holy crap! The new ChatGPT version, named GPT-4o, just got its first live demos at the OpenAI Spring Update, and it makes Siri look downright primitive. 

We're talking about a chatbot that can express real emotion, do real-time translations and use vision features like Google Lens to help with everything from solving linear equations to guessing your mood.

Yup, the GPT-4o voice assistant can do real-time conversational speech. In the first demo, one of the OpenAI researchers, Mark, asks GPT-4o to help him calm his nerves because he's giving a live demo. The chatbot responds that this is awesome, sounding excited, and then guides him through a breathing session.

Mark purposely breathes in heavy and fast, and GPT-4o responds with humor that he's "not a vacuum cleaner." The assistant does get tripped up a bit with the audio being out of sync but overall it's very impressive. Plus, you can interrupt the model, so you can change gears at any time.

Detecting emotions from selfies

(Image credit: OpenAI)

It's important to point out that GPT-4o can perceive emotion. But it can also generate emotion. For example, Mark asked the voice assistant to read a bedtime story and proceeded to give it instructions on being more expressive and dramatic. So you can say "I want a little more emotion in your voice and some drama." 

As a result, GPT-4o reads with a lot more passion, and the assistant could even switch to a robot voice on command. You can even ask it to sing and it complies.

Next up is vision. OpenAI showed GPT-4o a linear math question and the assistant jumped the gun, trying to solve the problem before it was even shown. But even here it showed emotion, saying "whoops, I got too excited." This small moment is a breakthrough, as you can see it's capable of correcting its own mistakes and even having a laugh about them too.

Eventually, GPT-4o recognized the equation "3x + 1 = 4" and gave hints on how to solve it without giving away the answer. This could make GPT-4o a good homework helper.

GPT-4o is also smart enough to recognize and analyze code on your PC, and it can even look at graphs and provide real-time feedback and information.

I was especially impressed by GPT-4o's real-time translation tool. When asked if it could translate a conversation from English to Italian, it said "Perfecto!" and got to work. The assistant was accurate and friendly throughout the process, and I could see it being a great travel tool.

Writing out an equation for ChatGPT to solve

(Image credit: OpenAI)

Last but not least, OpenAI showed how GPT-4o can detect emotion just by looking at your selfie through your phone's front camera. It knew that the person was smiling and even asked "want to share the reason for your good vibes?"

This is just a first taste of what GPT-4o can do, and it's already leaps and bounds smarter and more versatile than Siri, Google Assistant/Gemini and Alexa. With Apple reportedly working on Siri 2.0 and Google I/O coming up tomorrow, the pressure is officially on.

GPT-4o will be rolling out in the coming weeks, and we can't wait to try it out.

Mark Spoonauer

Mark Spoonauer is the global editor in chief of Tom's Guide and has covered technology for over 20 years. In addition to overseeing the direction of Tom's Guide, Mark specializes in covering all things mobile, having reviewed dozens of smartphones and other gadgets. He has spoken at key industry events and appears regularly on TV to discuss the latest trends, including Cheddar, Fox Business and other outlets. Mark was previously editor in chief of Laptop Mag, and his work has appeared in Wired, Popular Science and Inc. Follow him on Twitter at @mspoonauer.