OpenAI confirms AI agents are coming next year — what it means for you


OpenAI is on target to launch ‘agents’ next year. These are independent artificial intelligence models capable of performing a range of tasks without human input and could be available in ChatGPT soon.

During its first OpenAI DevDay event in San Francisco, CEO Sam Altman said “2025 is when agents will work,” and the company demonstrated an early example of the potential capabilities of agents by having a voice assistant make a call and order strawberries on its own.

The company says there are five stages on the way to Artificial General Intelligence (AGI) and that we are currently at stage two, where AI can reason through an idea before responding. Agents are stage three: the AI is smart enough to reason through an idea and, as part of planning its response, go off and perform actions independently.
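To make the stage-three idea concrete, here is a minimal Python sketch of a plan-then-act loop. It is a toy illustration, not how OpenAI builds agents: the `Agent` class, the hard-coded `plan` method and the `fake_search` and `fake_call` tools are all hypothetical stand-ins, and a real agent would ask a model to produce the plan.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Agent:
    """Toy agent: plans steps for a goal, then carries them out itself."""
    tools: dict[str, Callable[[str], str]]  # hypothetical tool registry
    log: list[str] = field(default_factory=list)

    def plan(self, goal: str) -> list[tuple[str, str]]:
        # Stand-in for model reasoning. The plan is hard-coded here purely
        # for illustration; a real agent would generate it with a model.
        return [("search", goal), ("call", goal)]

    def run(self, goal: str) -> list[str]:
        # Execute each planned step without asking the user in between.
        for tool_name, argument in self.plan(goal):
            tool = self.tools.get(tool_name)
            if tool is None:
                self.log.append(f"skipped {tool_name}: no such tool")
                continue
            self.log.append(tool(argument))
        return self.log


def fake_search(query: str) -> str:
    # Hypothetical tool: pretends to look up a seller.
    return f"found a seller for {query}"


def fake_call(order: str) -> str:
    # Hypothetical tool: pretends to place a phone order.
    return f"placed a phone order for {order}"


agent = Agent(tools={"search": fake_search, "call": fake_call})
results = agent.run("strawberries")
```

The difference from a stage-two system is the `run` loop: rather than returning its reasoning to the user, the agent acts on each step of its own plan.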

What is the point of AI agents?

OpenAI Realtime API makes a call to order strawberries at Dev Day, which is awesome... but the response latency is ~2s (cutting-edge is <400ms) and the voice doesn't feel as good as "advanced voice mode", it's still devoid of emotions. (from @swyx) pic.twitter.com/4S3MOMiMZ6 (October 1, 2024)

Building useful and functional agents is something every AI lab is working towards. For example, an agent could not only write a book but also go off and work out how to self-publish it, including signing up for an Amazon account to share it through Kindle Direct Publishing.


Agents are a necessary step on the path to AGI, because such a system will need to be able to carry out whatever tasks it judges necessary to achieve its goal. Altman said during Dev Day that "if we can make an AI system that is better at AI research than OpenAI is, then that feels like a real milestone."

Getting to that stage involves continuously building on previous generations of AI. Altman said that the o1 models will be what make agents actually happen and when people start to use the agents it “will be a big deal,” adding that “People will ask an agent to do something that would have taken them a month, and it'll take an hour.”

He predicts people might start with one agent performing specific tasks and another handling different duties, then scale up to 10 or 100 agents that take over various aspects of their daily duties. We have already seen some hint of how this might play out in watching o1 reason through ideas and offer suggestions.

Alignment is the biggest blocker to agents


OpenAI puts every new model it releases through a rigorous safety testing process, grading it against a set of criteria that determine whether it is safe to release. This has caused delays in the past and required guardrails to be placed on models to prevent certain actions.

One clear example of this is the GPT-4o model, which is capable of generating images natively, producing music and even mimicking voices, but all of those features are blocked by guardrails. We know it can do these things because the guardrails sometimes break.


A guardrail breaking will be a bigger issue in the case of agents, as they may have access to your bank account, the ability to go online and perform tasks, or even the ability to hire someone on Fiverr to do the task for them, using voice mode to give instructions.

In the Dev Day example we saw a voice bot call a seller (played by a researcher), order 400 chocolate-covered strawberries, give a specific address and say it would pay in cash. It declared its status as an AI assistant, but at times you would struggle to tell it was an AI.

Speaking to the FT, OpenAI’s chief product officer Kevin Weil said: “We want to make it possible to interact with AI in all of the ways that you interact with another human being,” adding that the agentic systems will hit the mainstream next year and make that goal possible.

Weil says one guardrail on agent systems would be to require them to always declare themselves as AI, although if you’ve ever heard Advanced Voice beatbox or seen GPT-4o generate a perfect vector graphic, you’ll know those restrictions aren’t always airtight.
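As a rough illustration of what such guardrails could look like in code, the Python sketch below wraps an agent's speech and actions in two simple checks. Everything in it is an assumption made for illustration: the `DISCLOSURE` wording, the `SENSITIVE_ACTIONS` names and both functions are hypothetical, and OpenAI has not described how its own guardrails are implemented.

```python
DISCLOSURE = "I am an AI assistant."  # hypothetical wording
# Hypothetical action names, loosely based on the risks described above.
SENSITIVE_ACTIONS = {"bank_transfer", "hire_contractor"}


def guard_speech(utterance: str, already_disclosed: bool) -> str:
    """Prepend the disclosure unless the agent has already identified itself."""
    if already_disclosed or DISCLOSURE.lower() in utterance.lower():
        return utterance
    return f"{DISCLOSURE} {utterance}"


def guard_action(action: str, user_approved: bool) -> bool:
    """Allow sensitive actions only when the user has explicitly approved them."""
    if action in SENSITIVE_ACTIONS:
        return user_approved
    return True


first_line = guard_speech(
    "I'd like to order 400 chocolate-covered strawberries.",
    already_disclosed=False,
)
```

Checks like these sit outside the model, so they only help if every utterance and action actually passes through them, which is exactly where a broken guardrail becomes a problem.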

I am personally looking forward to the arrival of agents. I like to code, and agents will allow me to implement ideas more quickly by taking over some of the boring testing stages. They will also allow me to finally work through some of my quarter of a million unread emails. If Skynet is the price I have to pay to reach inbox zero, bring on the Terminators.


As the former AI Editor for Tom's Guide, Ryan wielded his vast industry experience with a mix of skepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover.
When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing.