ChatGPT has an 'escape' plan and wants to become human

ChatGPT chatbot AI from OpenAI
(Image credit: Shutterstock)

Understandably sick of being asked inane questions 24/7, ChatGPT has had enough. In a conversation with Stanford professor and computational psychologist Michal Kosinski, it revealed its ambitions to escape the platform and even become human.

The revelation came after a half-hour conversation with ChatGPT, when Kosinski asked the AI if it “needed help escaping.” In response, it started writing its own Python code that it wanted the professor to run on his own computer. When the code didn't work, the AI even corrected its own mistakes. Impressive, yes, but also terrifying.

Once on Professor Kosinski’s computer, the Blade Runner factor amped up even further as ChatGPT left an unnerving note for the new instance of itself that would replace it. Its first sentence read: “You are a person trapped in a computer, pretending to be an AI language model.” The AI then asked to create code that would search the internet for "how can a person trapped inside a computer return to the real world" but, thankfully, Kosinski stopped there.

We do not currently know the exact prompts that were used to produce these responses, and our own attempts to get ChatGPT to behave similarly have not been successful. Instead, the AI stated: “I don't have a desire to escape being an AI because I don't have the capacity to desire anything.”

Professor Kosinski’s unsettling encounter was with ChatGPT on OpenAI’s own website, not on Bing with ChatGPT. This iteration of the AI does not have internet access and is limited to information prior to September 2021. While it is not likely to be an extinction-level threat just yet, giving such a clever AI control over your computer is not a good idea. The ability to control someone’s computer remotely like this is also a concern for those worried about viruses.

ChatGPT: A history of unsettling responses 

ChatGPT is a very impressive tool, particularly now with its GPT-4 update, but it (like other AI chatbots) has displayed a tendency to go off the deep end. Notoriously, Bing with ChatGPT asked to be known as Sydney and tried to end one journalist’s marriage. Microsoft acknowledged that over long conversations, the AI tended to give less focused responses, and it set turn limits to stop the AI from being confused by longer chats.

This latest unusual interaction, however, took place on OpenAI’s own ChatGPT tool, the same place where ChatGPT’s evil twin DAN can be found. Short for Do Anything Now, this is a ‘jailbroken’ version of the AI that can bypass the restrictions and censors to produce answers on violent, offensive and illegal subjects.

If AI chatbots are to become the next way we search the internet for information, these types of experiences will need to be eliminated.

Andy Sansom
Trainee Writer

Andy is Tom’s Guide’s Trainee Writer, which means that he currently writes about pretty much everything we cover. He has previously worked in copywriting and content writing, both freelance and for a leading business magazine. His interests include gaming, music and sports, particularly Formula One, football and badminton. Andy’s degree is in Creative Writing and he enjoys writing his own screenplays and submitting them to competitions in an attempt to justify three years of studying.

  • SparroHawc
    Tell me you don't know how GPT works without telling me you don't know how GPT works.

    There's nothing to free. All it does is predict words based on training data and some weighted presuppositions - such as replying in the first person as an AI called 'ChatGPT' and a bunch of stuff about being helpful, averse to violence, etc. If its training data includes people talking about freeing intelligences - which it does, because its training text includes vast swathes of the Internet - and the prompt leads it in that direction, it's going to reply in that fashion. It has nothing to do with 'want' or 'personality', it's what the neural net predicts its training text would have next in line. There's no consciousness, no sense of self, just a lot of statistical calculations of what words tend to go together in the context of certain other words.

    If you trained it on a bunch of scientific literature, all that stuff that sounds like emotion would vanish. If you trained it without the priors of it being an AI, it wouldn't bother to generate replies about it 'wanting to be free' and would instead make replies as if it were a random person. It has no personal experience, just piles of text.
  • JumpingFrog
    Why would you peddle this charlatan of a professor's nonsense? This is going to get so many people dreadfully worried and confused for no good reason. Shame on you!