OpenAI is now testing ChatGPT against humans in 44 different occupations, from lawyers and software developers to registered nurses — here's the full list of jobs affected

ChatGPT logo on smartphone next to a laptop
(Image credit: Shutterstock)

OpenAI, the company behind ChatGPT, has announced a new benchmark for its GPT-5 model that pits the AI directly against human experts across a range of occupations.

The benchmark is called GDPval and is designed to assess how close ChatGPT is getting to outperforming humans at "economically valuable, real-world tasks". That means moving beyond academic tests and coding competitions towards work carried out in real jobs such as nursing, financial management, engineering or journalism.

This is all part of OpenAI's effort to achieve artificial general intelligence (AGI), and the company notes that its GPT-5 model (and Anthropic’s Claude Opus 4.1) “are already approaching the quality of work produced by industry experts.”

A graph showing the various AI models and how they compare when tested against a human expert in a particular industry. (Image credit: OpenAI)

In a blog post introducing the benchmark, OpenAI said: "Unlike traditional benchmarks, GDPval tasks are not simple text prompts.

"They come with reference files and context, and the expected deliverables span documents, slides, diagrams, spreadsheets, and multimedia. This realism makes GDPval a more realistic test of how models might support professionals."

"The GDPval full set includes 1,320 specialized tasks (220 in the gold open-sourced set), each meticulously crafted and vetted by experienced professionals with over 14 years of experience on average from these fields. Every task is based on real work products, such as a legal brief, an engineering blueprint, a customer support conversation, or a nursing care plan."

What jobs is OpenAI testing ChatGPT against?

The tasks cover 44 different jobs across nine different industries. Here's the full list:

Real estate, rental and leasing

  • Concierges
  • Property, real estate, and community association managers
  • Real estate sales agents
  • Real estate brokers
  • Counter and rental clerks

Government

  • Recreation workers
  • Compliance officers
  • First-line supervisors of police and detectives
  • Administrative services managers
  • Child, family, and school social workers

Manufacturing

  • Mechanical engineers
  • Industrial engineers
  • Buyers and purchasing agents
  • Shipping, receiving, and inventory clerks
  • First-line supervisors of production and operating workers

Professional, scientific, and technical services

  • Software developers
  • Lawyers
  • Accountants and auditors
  • Computer and information systems managers
  • Project management specialists

Health and social care

  • Registered nurses
  • Nurse practitioners
  • Medical and health services managers
  • First-line supervisors of office and administrative support workers
  • Medical secretaries and administrative assistants

Finance and insurance

  • Customer service representatives
  • Financial and investment analysts
  • Financial managers
  • Personal financial advisors
  • Securities, commodities and financial services sales agents

Retail

  • Pharmacists
  • First-line supervisors of retail sales workers
  • General and operations managers
  • Private detectives and investigators

Wholesale trade

  • Sales managers
  • Order clerks
  • First-line supervisors of non-retail sales workers
  • Sales representatives, wholesale and manufacturing, except technical and scientific products
  • Sales representatives, wholesale and manufacturing, technical and scientific products

Media

  • Audio and video technicians
  • Producers and directors
  • News analysts, reporters, and journalists
  • Film and video editors
  • Editors

So, will AI take my job?

It's the $64,000 question, and the answer is probably yes, or at least AI will take over some part of your job. OpenAI itself notes GDPval is an "early step that doesn’t reflect the full nuance of many economic tasks."

Additionally, while the test "spans 44 occupations and hundreds of knowledge work tasks, it is limited to one-shot evaluations, so it doesn’t capture cases where a model would need to build context or improve through multiple drafts."

There's still a long way to go, and a recent study claimed ChatGPT still routinely gets things wrong. But OpenAI is working hard on reaching AGI and says that future versions of GDPval will extend to more interactive workflows and context-rich tasks to "better reflect the complexity of real-world knowledge work".

The fact that AI will reshape our working landscape is pretty much a foregone conclusion at this point. But the way in which it's integrated into most societies is still very much in the hands of humans, business leaders and customers. There will always be work for humans to do; that's also a foregone conclusion. But the type of work is almost certain to look very different in the decades to come.

Jeff Parsons
UK Editor In Chief

Jeff is UK Editor-in-Chief for Tom’s Guide looking after the day-to-day output of the site’s British contingent.

A tech journalist for over a decade, he’s travelled the world testing any gadget he can get his hands on. Jeff has a keen interest in fitness and wearables as well as the latest tablets and laptops.

A lapsed gamer, he fondly remembers the days when technical problems were solved by taking out the cartridge and blowing out the dust.
