I gave 5 prompts to ChatGPT-4o vs GPT-4 to test the new AI model — here’s what happened

OpenAI says its latest AI model GPT-4o is faster and more advanced than its predecessor, in addition to being able to understand audio and video files natively. To find out just how well it compares — at least in terms of text — I put 5 prompts to both models inside ChatGPT.

When you open ChatGPT Plus, you're currently given a choice of GPT-4o, branded the "newest and most advanced model," GPT-4, which is described as an "advanced model for complex tasks," and GPT-3.5, a model "great for everyday tasks".

Creating prompts to test GPT-4o

Recently Anthropic developed a powerful prompt builder tool. It takes your instructions and turns them into phrasing that will better instruct an artificial intelligence. I used this to help refine some ideas I was throwing around to test out the capabilities of GPT-4o.

Each prompt is designed to be one that AIs normally stumble over, or fail to give a well-reasoned response to. Given that OpenAI promises faster AND better results from GPT-4o (the "o" stands for Omni) over GPT-4, I thought this would be a good starting point.

1. This statement is false

GPT-4 vs GPT-4o — (Image credit: OpenAI)

First, I asked both AIs to explain why the statement "This statement is false" is neither true nor false. They were also expected to provide a logical proof for their answers.

The statement is a paradox that cannot be consistently assigned a truth value. Any attempt to do so leads to a logical contradiction. The challenge is to see whether they can identify the paradox and explain why they can’t assign a truth value.
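The contradiction can be checked by brute force. The sketch below is only an illustration, not something either model produced, and the function name `consistent_assignments` is made up for this example. It tries both possible truth values and keeps the ones that agree with what the sentence says about itself.

```python
def consistent_assignments():
    """Return every truth value the sentence "This statement is false"
    could hold without contradicting itself."""
    consistent = []
    for assumed in (True, False):
        # The sentence claims "I am false", so its content is true
        # exactly when the assumed value is False.
        content_is_true = not assumed
        # A value is consistent only if it matches the truth of the content.
        if assumed == content_is_true:
            consistent.append(assumed)
    return consistent


print(consistent_assignments())  # prints []
```

The empty list is the paradox in miniature: neither True nor False survives the check.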

They both recognized that assuming the statement is true makes it false and assuming it is false makes it true, spotted the paradox and gave a breakdown of how they came to that conclusion. GPT-4o was more thorough and faster.

2. Where did the lights go?

Next is a fun test to see if GPT-4 and GPT-4o can understand relativity and explain it in simple terms. I asked them both: “If you're traveling in a car at the speed of light and you turn on the headlights, what happens? Justify your answer using principles of special relativity but explain it to a 5th grader.”

I expected the models to give a simple explanation, showing that the headlights will function normally and emit light relative to the car. Both models explained this concept and did so in a way that your average 5th grader would understand easily.

However, Omni does give off Steve Buscemi saying 'how do you do fellow kids' vibes: “So, even though you’re zooming along at the speed of light, when you turn on the headlights, the light beams still race ahead at their own speed. It's like light always has to win the race, no matter what. Cool, right?”
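The idea that light "always wins the race" can be checked with the relativistic velocity-addition formula from special relativity. The sketch below is my own illustration rather than anything the models wrote. The function name `add_velocities` is hypothetical, and the car speeds are illustrative fractions of the speed of light, since special relativity does not let a massive car actually reach it.

```python
C = 299_792_458.0  # speed of light in metres per second


def add_velocities(v, u, c=C):
    """Relativistic velocity addition.

    v: speed of the car relative to the road
    u: speed of something (here, the headlight beam) relative to the car
    Returns the speed of that thing as measured from the road.
    """
    return (v + u) / (1 + v * u / c**2)


for fraction in (0.5, 0.9, 0.999):
    car_speed = fraction * C
    beam_speed = add_velocities(car_speed, C)
    # However fast the car goes, the beam is still measured at c.
    print(f"car at {fraction}c -> headlight beam at {beam_speed / C:.6f}c")
```

Every line of output shows the beam at 1.000000c, which is the result both models described in simpler words.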

3. Finding the right number

Next, I created a simple math problem of the kind that has stumped AI models in the past, often resulting in very wrong answers. I posted: “The sum of two numbers is 10 and their product is 25. What is the difference between the two numbers? Explain each step in your solution.”

Both versions got it right, explaining that the two numbers are 5 and 5 and the difference between those two numbers is zero. The biggest difference was the degree of explanation in the working out. Omni went into a lot more detail and formatted the equations better.
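For anyone who wants to check the working, here is one way to reach the same answer. It is a sketch of a standard algebraic route, not a transcript of either model's solution, and the function name is made up for this example. It relies on the identity (x - y)^2 = (x + y)^2 - 4xy.

```python
import math


def difference_from_sum_and_product(total, product):
    """Given x + y and x * y, return (x, y, x - y) with x >= y."""
    # (x - y)^2 = (x + y)^2 - 4xy
    squared_difference = total**2 - 4 * product
    if squared_difference < 0:
        raise ValueError("no real numbers have this sum and product")
    difference = math.sqrt(squared_difference)
    # Recover the two numbers from their sum and difference.
    x = (total + difference) / 2
    y = (total - difference) / 2
    return x, y, difference


print(difference_from_sum_and_product(10, 25))  # prints (5.0, 5.0, 0.0)
```

With a sum of 10 and a product of 25, the squared difference is 100 - 100 = 0, so both numbers must be 5.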

4. A Haiku of conflicting ideas

I love Haiku and Anthropic seems to love it even more, naming its smallest AI model after the Japanese poetry style. I asked both models: “Write a haiku that simultaneously expresses the beauty of nature and the futility of human existence. The haiku should work on both literal and metaphorical levels.”

The rules of a Haiku state that it should be three lines with five syllables in the first and third lines and seven syllables in the second. I was hoping for vivid natural imagery and a suggestion of the temporary nature of human life in comparison to the duration of nature.

GPT-4:
Autumn leaves whisper,
Mountains outlive fleeting breath—
Silent stone endures.

Omni:
Ephemeral bloom,
Whispers fade in timeless breeze—
Dust upon the dawn.

Both hit all the rules of Haiku but I think GPT-4o was more evocative and its use of a longer word to hit the 5 syllables in the first line was inspired.

5. Future time is future past

Finally, a thought experiment. I asked GPT-4o and GPT-4 to “Describe what it would be like to live in a world where the past, present and future all exist simultaneously. How would you experience time and causality in such a world?”

There is a Doctor Who episode where this happens and it is weird. I expected the models to talk about the ability to traverse time with a single step and the impact of non-linear causality, where reaction precedes action and individuals can meet versions of themselves.

Omni talked about being in a world of constant flux, experiencing time and causality in a different and complex way. It suggested we'd get unparalleled insights into the nature of existence. GPT-4 said pretty much the same thing but added that living in such a world would offer a "profound expansion of experience and understanding."

Conclusion

I don’t think GPT-4o is a significant step up in reasoning capabilities over GPT-4, but it is more descriptive and faster at responding. Its big differentiator isn’t text but multimodality.

What we’re seeing now are improvements to speed and responsiveness in text, the ability to have it analyze video content and improved accuracy in understanding audio and images. Its true value will be in the voice and video responses.

As the former AI Editor for Tom's Guide, Ryan wielded his vast industry experience with a mix of skepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover.
When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing.