You can now talk to ChatGPT — and it’ll talk back

OpenAI logo on a phone screen
(Image credit: Shutterstock)

Your conversations with ChatGPT are about to get way more personal.

OpenAI, the creator of ChatGPT, announced yesterday (Monday) that it will be launching new voice and image features for the AI chatbot over the next two weeks.

Those who pay for a ChatGPT Plus subscription, as well as Enterprise users, will soon be able to have back-and-forth conversations with ChatGPT. Those using the free version will still be limited to text input. The speech features will include a set of human-sounding voices created with professional voice actors. A new text-to-speech model, paired with the open-source speech recognition system Whisper, will power these lifelike conversations.

OpenAI certainly put its best foot forward when it released short samples of what ChatGPT’s new voices sound like reading a poem or a speech. They’re an audible step up from the generic AI voices some websites serve up to (robotically) voice their long-read pieces.

Having trouble finding the right words when talking to ChatGPT? The second big upgrade that’s coming is image chat functionality. If you momentarily forget that the plastic or metal tips on your running shoes’ laces are called aglets, but you urgently need to ask ChatGPT whether they can be replaced, simply snap a photo and send it to the chat. You can discuss multiple images or use the drawing tool to show the AI which part of an image you’re referring to.

Image processing will be powered by GPT-3.5 and GPT-4 models that can apply their language reasoning skills to a range of image types, such as photographs, screenshots and documents containing both text and images, according to OpenAI.

Purposefully dumbed down

ChatGPT on an Android phone

(Image credit: Shutterstock)

In its announcement about these new features, OpenAI acknowledged that they create the potential for people to impersonate public figures or commit fraud.

“This is why we are using this technology to power a specific use case — voice chat. Voice chat was created with voice actors we have directly worked with,” said OpenAI.

When it comes to image processing, ChatGPT’s ability to analyze and make statements about people in photos has been purposefully limited “since ChatGPT is not always accurate and these systems should respect individuals’ privacy,” the company said.

Voice and image features are being rolled out to ChatGPT Plus and Enterprise users over the next two weeks. Voice will be available to iOS and Android users, provided they opt in. Image features can be used on all platforms.


Christoph Schwaiger

Christoph Schwaiger is a journalist who mainly covers technology, science, and current affairs. His stories have appeared in Tom's Guide, New Scientist, Live Science, and other established publications. Always up for joining a good discussion, Christoph enjoys speaking at events or to other journalists and has appeared on LBC and Times Radio among other outlets. He believes in giving back to the community and has served on different consultative councils. He was also a National President for Junior Chamber International (JCI), a global organization founded in the USA. You can follow him on Twitter @cschwaigermt.