GPT-5.4 is here — and OpenAI just made every other AI model look slow

ChatGPT Image — (Image credit: Shutterstock)

TL;DR

OpenAI has officially launched GPT-5.4, a new frontier model that consolidates its best reasoning, coding and agentic capabilities into a single package
Faster than GPT-5.2, dramatically better at real-world professional tasks
Capable of controlling computers natively

OpenAI is not having a quiet week. From amending Pentagon deals to managing the PR fallout from a leaked internal transcript, the company appears to be dealing with plenty behind closed doors.

Yet despite the turmoil, OpenAI has just launched GPT-5.4, its most capable and efficient frontier model to date, rolling it out simultaneously across ChatGPT, the Codex platform and its developer API.

For users on Plus, Team and Pro plans, the new model — called GPT-5.4 Thinking inside ChatGPT — begins rolling out today.

This is hardly a minor refresh. GPT-5.4 combines the elite coding abilities of GPT-5.3-Codex with significantly improved reasoning, computer use and knowledge-work capabilities.

The result is a model designed to do real work: operating software, analyzing spreadsheets and powering long-horizon agent workflows with minimal hand-holding.

What makes GPT-5.4 different?

Screenshot of GPT-5.4 — (Image credit: OpenAI)

The biggest shift here is the rise of native computer use. GPT-5.4 is the first general-purpose OpenAI model that can take control of a computer — clicking, typing and navigating software using screenshots and mouse/keyboard commands, without relying on a separate specialized model.

Developers can now build agents that actually operate websites and applications, not just generate text about them.
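To make the screenshot-and-input idea concrete, here is a minimal Python sketch of the observe-act loop such an agent might run. It does not use OpenAI's API: `Action`, `run_agent` and `scripted_policy` are hypothetical names invented for illustration, and the scripted policy is a stand-in for wherever the model's decision would come from.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One mouse or keyboard step (hypothetical shape, not an OpenAI type)."""
    kind: str          # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


def run_agent(
    take_screenshot: Callable[[], bytes],
    choose_action: Callable[[bytes, List[Action]], Action],
    perform: Callable[[Action], None],
    max_steps: int = 20,
) -> List[Action]:
    """Observe-act loop: take a screenshot, pick one action, perform it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()
        action = choose_action(screenshot, history)
        history.append(action)
        if action.kind == "done":
            break
        perform(action)
    return history


def scripted_policy(screenshot: bytes, history: List[Action]) -> Action:
    """Stand-in for the model: click a field, type a query, then stop."""
    script = [
        Action("click", x=120, y=48),
        Action("type", text="quarterly report"),
        Action("done"),
    ]
    return script[len(history)]


# Fake screenshot source and a list that records what would have been performed.
performed: List[Action] = []
history = run_agent(lambda: b"fake-png-bytes", scripted_policy, performed.append)
print([a.kind for a in history])
```

In a real agent, `choose_action` would send the screenshot to the model and parse its reply, and `perform` would drive the actual mouse and keyboard. The `max_steps` cap is a design choice to keep a confused agent from looping forever.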

On OSWorld-Verified, the benchmark that measures a model's ability to navigate a real desktop environment, GPT-5.4 scores 75.0%, which not only destroys GPT-5.2's 47.3% score but also edges past the measured human baseline of 72.4%. In other words, on this benchmark the model already outperforms the human baseline at navigating a computer via screenshots alone.

Professional work: where it really shines

Writer typing on keyboard — (Image credit: Shutterstock)

OpenAI says GPT-5.4 is specifically engineered to be better at the kind of work real professionals do every day: building financial models, editing presentations, drafting legal documents and managing complex spreadsheets.

On an internal benchmark of spreadsheet modeling tasks designed for junior investment banking analysts, GPT-5.4 scored 87.5% — up from 68.4% for GPT-5.2. That's a massive improvement for anyone automating financial workflows. Similarly, human evaluators preferred GPT-5.4's presentations over GPT-5.2's 68% of the time, citing stronger visual variety and better use of image generation.

Hallucinations are down significantly. According to OpenAI, GPT-5.4's individual factual claims are 33% less likely to be false than GPT-5.2's, and its full responses are 18% less likely to contain any errors — a meaningful upgrade for professionals who rely on accurate outputs.
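Those percentages are relative reductions, not percentage-point drops, so what they mean in practice depends on the starting error rate. The short Python sketch below works through the arithmetic. OpenAI's baseline error rates are not given here, so the two baselines are made-up placeholders chosen only to show the calculation.

```python
def apply_relative_reduction(baseline_rate: float, reduction: float) -> float:
    """Return the new rate after a relative reduction (0.33 means 33% lower)."""
    return baseline_rate * (1.0 - reduction)


# Hypothetical baselines for GPT-5.2 (NOT figures from OpenAI).
baseline_claim_error = 0.06      # share of individual claims that are false
baseline_response_error = 0.30   # share of full responses with any error

# Relative reductions quoted by OpenAI for GPT-5.4.
new_claim_error = apply_relative_reduction(baseline_claim_error, 0.33)
new_response_error = apply_relative_reduction(baseline_response_error, 0.18)

print(f"Claim-level error rate:    {baseline_claim_error:.1%} -> {new_claim_error:.1%}")
print(f"Response-level error rate: {baseline_response_error:.1%} -> {new_response_error:.1%}")
```

Under these assumed baselines, a 33% relative reduction moves the claim-level error rate by about two percentage points, which is why the baseline matters when judging how big the improvement feels.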

Coding: faster, smarter, more visual

screenshot Coding on ChatGPT — (Image credit: OpenAI)

GPT-5.4 now serves as OpenAI's primary coding model too, so for most tasks there is no longer any need to choose between ChatGPT and Codex: you can simply start coding in the chat. It matches or outperforms GPT-5.3-Codex on SWE-Bench Pro while also being faster, especially at lower reasoning effort settings.

A new fast mode in Codex delivers a speed improvement of up to 1.5x across all supported models. OpenAI also highlights that GPT-5.4 is notably better at complex front-end coding tasks, producing results that are both more aesthetically polished and more functionally correct.

A new experimental feature — "Playwright (Interactive)" — lets Codex visually debug web and Electron apps in real time, even testing apps as it builds them using its native computer-use capabilities.

What everyday users can expect from GPT-5.4 in ChatGPT

Person typing on laptop keyboard — (Image credit: Unsplash)

For everyday ChatGPT users, the most noticeable change is that GPT-5.4 Thinking now shows an upfront plan before it starts working on complex tasks. You can intervene, redirect or adjust mid-response without starting over — a feature that promises to save significant time on multi-step research or creative projects.

The model can also maintain coherent context across much longer workflows, handling extended conversations and complex prompts without losing track of earlier steps. This is now live on chatgpt.com and Android, with iOS coming soon.

Availability, tool use and agents

ChatGPT logo on iPhone in person's hand (Image credit: Getty Images)

ChatGPT Plus, Team, and Pro users get GPT-5.4 Thinking starting today. Enterprise and Edu plan users can enable early access via admin settings. GPT-5.4 Pro is exclusive to Pro and Enterprise plans. Developers can access both gpt-5.4 and gpt-5.4-pro via the API immediately.

Perhaps these tiers are getting the features first because of one significant upgrade in GPT-5.4, tool search: instead of loading every available tool's full definition into context upfront (which can burn tens of thousands of tokens per request), the model receives a lightweight list and looks up specific tools only when needed.

In testing on 250 tasks from Scale's MCP Atlas benchmark with 36 MCP servers enabled, the tool-search configuration reduced total token usage by 47% while maintaining accuracy. For developers building large agentic systems, that translates directly to lower costs and faster response times.
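The idea behind tool search can be sketched in a few lines of Python. This is an illustration of deferred lookup in general, not OpenAI's implementation or API: the registry, the `make_tool` and `lookup` helpers, and the four-characters-per-token estimate are all assumptions, and the numbers it prints say nothing about the 47% figure above.

```python
import json
from typing import Dict, List


def make_tool(name: str) -> dict:
    """Build a hypothetical tool definition with a deliberately bulky schema."""
    return {
        "name": name,
        "description": f"Hypothetical tool {name}, for illustration only.",
        "parameters": {
            "type": "object",
            "properties": {
                f"arg_{i}": {"type": "string", "description": f"Argument {i} of {name}."}
                for i in range(8)
            },
        },
    }


# Hypothetical registry of 20 tools, keyed by name.
REGISTRY: Dict[str, dict] = {f"tool_{i}": make_tool(f"tool_{i}") for i in range(20)}


def approx_tokens(obj: object) -> int:
    """Very rough estimate, assuming about 4 characters per token."""
    return len(json.dumps(obj)) // 4


def lookup(names: List[str]) -> List[dict]:
    """Return full definitions only for the tools that were asked for."""
    return [REGISTRY[n] for n in names if n in REGISTRY]


# Eager: every full definition goes into the context upfront.
eager_tokens = approx_tokens(list(REGISTRY.values()))

# Deferred: a lightweight list of names, plus one definition fetched on demand.
index = sorted(REGISTRY)
deferred_tokens = approx_tokens(index) + approx_tokens(lookup(["tool_3"]))

print(f"eager: ~{eager_tokens} tokens, deferred: ~{deferred_tokens} tokens")
```

The saving in this toy setup comes from the same place as in the article's description: most requests touch only a few tools, so paying for every schema on every request is wasted context.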

It's clear OpenAI is catering to developers and power users with this rollout.

Final thoughts

GPT-5.4 is a legitimately significant release. Native computer use alone would make it noteworthy — but combined with best-in-class professional knowledge performance, a 1M token context window, and dramatically improved tool efficiency, it represents a meaningful step change for anyone building with or working alongside AI.

Bottom line: This is the model to watch in 2026.
