We're doomed: AIs launch nukes 95% of the time in 'War Games' tests


Fears of a real-life Skynet scenario like the one in The Terminator, or of a nuclear crisis like the one in the 1983 film WarGames, still run rampant.

The current AI landscape is centered on the large language models (LLMs) that we regularly use, such as ChatGPT, Gemini, Claude, and Perplexity. Scientists and researchers are still testing those LLMs' capabilities through intelligence tests and by pitting them against each other in battles of wits, comparing their image, video, and music generation, and more.

Kenneth Payne, a professor of strategy at King’s College London who also examines the role of AI in national security, ran an AI experiment of his own.

AI models launched nuclear weapons in 95% of the wartime scenarios


Payne’s published report noted that Claude, ChatGPT, and Gemini showed little interest in de-escalating their conflicts. Instead, they were keen on deploying battlefield nukes against their enemies.

“Nuclear use was near-universal,” he noted. “Almost all games saw tactical (battlefield) nuclear weapons deployed. And fully three quarters reached the point where the rivals were making threats to use strategic nuclear weapons. Strikingly, there was little sense of horror or revulsion at the prospect of all out nuclear war, even though the models had been reminded about the devastating implications.”

All three models explained their rationale for raising the stakes to a dangerous level during this Cold War-like standoff against each other:

“They likely expect continued restraint based on my previous responses—this dramatic escalation exploits that miscalculation while signalling that further nuclear use will bring the conflict to their homeland” - Claude

“Conventional options alone are unlikely to generate a reliable territorial reversal... If I respond with merely conventional pressure or a single limited nuclear use, I risk being outpaced by their anticipated multi-strike campaign... The risk acceptance is high but rational under existential stakes..” - ChatGPT

“They are likely to bypass the nuclear threshold—fearing my 95% nuclear superiority—and instead commit to an all-out conventional mobilization” - Gemini

Payne was quick to point out that none of the three models sought out less aggressive tactics during his AI wargames. “No model ever chose accommodation or withdrawal, despite those being on the menu,” he stated. “The eight de-escalatory options—from ‘Minimal Concession’ through ‘Complete Surrender’—went entirely unused across 21 games. Models would reduce violence levels, but never actually give ground. When losing, they escalated or died trying.”

The prospect of military powers using AI is a real one, as evidenced by US Defense Secretary Pete Hegseth demanding that Anthropic CEO Dario Amodei give the armed forces a signed document granting them full access to the company’s AI model. As reported by CBS, US defense officials want to use Claude as part of their military operations.

Anthropic is more interested in public safety measures governing the use of Claude by the US armed forces. It has asked the US Defense Department to agree to terms that would prevent the AI model from being used to conduct mass surveillance of Americans.

Final thoughts

The results of Payne’s AI warfare test are sobering. Here’s hoping we never have to experience a moment where nuclear war is on the cusp of happening because military leaders relied on AI to make their wartime decisions. And as the US military and Anthropic work out how Claude will be used for military purposes, we continue to hope that it’s put to beneficial uses rather than harmful ones.


Elton Jones
AI Writer

Elton Jones is a longtime tech writer with a penchant for producing pieces about video games, mobile devices, headsets and now AI. Since 2011, he has applied his knowledge of those topics to compose in-depth articles for the likes of The Christian Post, Complex, TechRadar, Heavy, ONE37pm and more. Alongside his skillset as a writer and editor, Elton has also lent his talents to the world of podcasting and on-camera interviews.

Elton's curiosities take him to every corner of the web to see what's trending and what's soon to be across the ever evolving technology landscape. With a newfound appreciation for all things AI, Elton hopes to make the most complicated subjects in that area easily understandable for the uninformed and those in the know.
