ChatGPT with GPT-4o — I cannot remember the last time I was this blown away by a piece of technology

GPT-4o
(Image credit: OpenAI)

OpenAI outshone Apple during last night's spring update livestream, both in the hype before the event and in the overwhelmingly positive reaction to the products the team announced. As CEO Sam Altman said: “It feels like magic”.

The biggest announcement was the model GPT-4o, which will power ChatGPT for both paid and free users. Unlike large language models, this is an omnimodal model, capable of taking in anything from text to video and outputting speech, text and even 3D files.

We used to talk about the iPhone moment when Steve Jobs changed the cellphone industry forever, and then in November 2022 we began to talk about the ChatGPT moment. This was another industry-defining product and I think OpenAI has done it again.

I’ve covered a lot of product announcements over a 20+ year career and this is the most excited I’ve ever been to try a new product. If Altman is to be believed, this is only just the beginning.

Why is GPT-4o such a big deal?

GPT-4o (or, the Omni model) brings a new way to interact with information. Instead of typing, you can just have a conversation or show it a video and get a voice response without any delay.

This response won’t have the slightly monotone delivery of other assistants or the faux inflections of the previous generation of ChatGPT Voice. It is a natural-sounding voice with laughter, emotion and inflections that react in real time to your conversation.

The full multimodal features with the ability to talk naturally using speech-to-speech are still being rolled out slowly, but even the chat version — conversing in text and pictures — is faster and more responsive than its predecessors.

Altman wrote in his blog: “Talking to a computer has never felt really natural for me; now it does. As we add (optional) personalization, access to your information, the ability to take actions on your behalf, and more, I can really see an exciting future where we are able to use computers to do much more than ever before.”

What might this future look like?

One day, and probably not as far away as many people think, this technology will power robots that work with us or serve us in our homes.

These will be robots we can converse with like a friend and ask to do complex tasks, and they will both understand and respond.

Somebody will fall in love with GPT-4o.

Even in the short term, as OpenAI rolls out iPad, iPhone and laptop apps for ChatGPT with voice and vision capabilities, we’ll see it take on the role of tutor, coding assistant, financial advisor and fitness coach, and do so without judgment.

What we’re witnessing — and other companies will catch up — is the dawn of a new era in human-computer interface technology.

Omni models don’t require the AI to first convert what you say to text, analyze the text and then convert that back to speech — they understand what we say natively by analyzing the audio, the inflections in our voice and even live video feeds.

The small black dot you talk to and that talks back is as big of a paradigm shift in accessing information as the first printing press, the typewriter, the personal computer, the internet or even the smartphone.

Ryan Morrison
AI Editor

Ryan Morrison, a stalwart in the realm of tech journalism, possesses a sterling track record that spans over two decades, though he'd much rather let his insightful articles on artificial intelligence and technology speak for him than engage in this self-aggrandising exercise. As the AI Editor for Tom's Guide, Ryan wields his vast industry experience with a mix of scepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover. When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing. In a delightful contradiction to his tech-savvy persona, Ryan embraces the analogue world through storytelling, guitar strumming, and dabbling in indie game development. Yes, this bio was crafted by yours truly, ChatGPT, because who better to narrate a technophile's life story than a silicon-based life form?