I just had a conversation with Hume's new AI voice assistant — and I forgot it wasn't human

Hume AI on an iPhone screen
(Image credit: Shutterstock / Future)

Hume EVI is an artificial intelligence speech-to-speech voice assistant, and with the most recent version 2 update, it may be more natural and intuitive than OpenAI’s GPT-4o Advanced Voice.

The brainchild of Hume cofounder Alan Cowen and his team, EVI 2 builds on the previous generation model with a more natural-sounding voice and better emotional understanding.

According to Hume: “EVI 2 can converse rapidly with users with sub-second response times, understand a user’s tone of voice, generate any tone of voice, and even respond to some more niche requests like changing its speaking rate or rapping.”

My testing found it more natural than OpenAI’s Advanced Voice but slightly slower and with fewer capabilities. For example, EVI is more empathic in its vocal tone, but ChatGPT is better at laughing and conveying other sounds associated with the human voice.

What is Hume EVI 2?

EVI 2 is an empathic voice assistant. Like ChatGPT Voice or Gemini Live, it is available as a dedicated smartphone app and online, and also as an API that developers can use in their own projects.

Hume's EVI 2 stands out from the crowd because of its flexibility. It is natively speech-to-speech and has its own LLM brain, but you can swap that for any other model, including GPT-4o or Gemini. You could even use EVI to give voice to Grok or Meta's Llama 3.1.

I spoke to Dr Cowen ahead of the release of EVI 2 and he said the goal is to “give developers the tools to build what they want,” explaining that the other players in the space are building ecosystems around themselves. “We train on top of open-source models to give them voice.”

“The developer can take this model, and use whichever framework they want, we also enable voice modulation and personality voices,” he added. He also said that a small version of the model could, in the future, run on the edge, on a laptop or even on a smart speaker.

Outside of the API and developer tools, the Hume AI app is an impressive experience, allowing you to hold a conversation, brainstorm ideas or even get something off your chest with a natural-sounding AI voice that detects your vocal tone and reacts accordingly.

“We’re building systems that can adapt the voice to the user automatically including adopting the right accent, taking a more relaxed or formal personality, whatever works to help you engage with the AI,” Dr Cowen told Tom’s Guide.

As well as using set voices developed by Hume, EVI 2 can clone voices, but this feature has been restricted. Instead, identity-related voice characteristics can be set to create a custom voice for each user, without cloning a real voice directly.

“GPT-4o is focused on the shiny capabilities, we’re focused on what the developer actually needs including the ability to modulate the voice without cloning,” Dr Cowen told me during an interview before the launch of the new model.

Hume's approach to voice development is prompt-based: users just type what they want the voice to sound like, and the AI does the work. “We came up with voice prompting and it can just follow that personality,” he said. It can also generate other languages and accents.

How well does EVI 2 work?

I tried EVI 2 on the Hume AI website with several voices. I found it impressively natural sounding, and it adapted its voice depending on how I spoke.

It is also a good storyteller, able to convey the emotional depth of a character. While it matches or even exceeds the emotion mimicry of ChatGPT Voice, it lacks other features, such as the breathing sounds and holding noises that are common in human speech. That said, I still got distracted enough during a conversation to forget it wasn’t human.

For fun, I also had EVI 2 have a conversation with ChatGPT Advanced Voice. I’ve tried this with other AI models to limited effect, but it worked well here. They started chatting away like old friends, talking about recipes and hobbies.

What makes EVI 2 an important step isn’t its capabilities; it is the company's wider approach. While you might use Advanced Voice in ChatGPT or Gemini Live on an Android handset, EVI could be built into any software or device — so it could be everywhere.

Its ability to track emotional responses through vocal tone could also prove helpful in the care sector, giving medical robots a bedside manner. Or it could be used to replace the automated voice on call waiting, able to soothe you out of an angry state despite still being number five million in the queue. It's got to be better than the lie: “your call is important to us.”

Ryan Morrison
AI Editor

Ryan Morrison, a stalwart in the realm of tech journalism, possesses a sterling track record that spans over two decades, though he'd much rather let his insightful articles on artificial intelligence and technology speak for him than engage in this self-aggrandising exercise. As the AI Editor for Tom's Guide, Ryan wields his vast industry experience with a mix of scepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover. When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing. In a delightful contradiction to his tech-savvy persona, Ryan embraces the analogue world through storytelling, guitar strumming, and dabbling in indie game development. Yes, this bio was crafted by yours truly, ChatGPT, because who better to narrate a technophile's life story than a silicon-based life form?