You can now convert your voice into any other using a new tool from ElevenLabs. The AI platform says the conversion keeps any emotion and intonation expressed in the original recording and carries them over to the new voice.
The quality of synthetic speech has improved dramatically in recent months, as has the speed of training an AI model on an entirely new voice. ElevenLabs lets you clone your own voice from a minute of audio, or create one by describing how it should sound.
Previously these synthetic voices were only available for text-to-speech conversion, which often lost some of the hidden meaning present in natural spoken language. Text-to-speech also struggled to process unknown words such as company or product names, or unusual personal names.
With its new voice-to-voice model, ElevenLabs promises that you can “say it how you want it and transform your voice into another character, with full control over emotions, timing, and delivery.”
How to use Voice-to-Voice technology
To test these claims, I had ChatGPT write a short radio play scene featuring three distinct characters. Then using my broken, flu-raddled voice, I recorded all the parts into ElevenLabs, selecting a different synthetic voice for each.
I didn’t speak particularly clearly at any point, put on faux and terrible American accents, used slang terms, and left dramatic pauses. For the most part, it did well with the emotion and intention of the phrasing — but it did struggle to convert a few letters and even whole words.
If you want to try creating your own virtual voices, the process is fairly straightforward, and there are a large number of options available even on the free plan.
Cloning a voice requires a premium plan, and without paying you are limited to 5,000 characters per month, spaces included. If you have aspirations of making a radio play, you may want to consider an upgrade.
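If you are planning something longer than a quick test, it can help to check how much of that monthly allowance a script would eat before you start. The Python snippet below is a rough sketch, not an official ElevenLabs tool: the function names are made up for illustration, the 5,000 figure is the one quoted above, and it assumes every character, spaces and line breaks included, counts toward the limit. How ElevenLabs meters recorded speech-to-speech audio may well differ.

```python
# Rough character budget check for a script.
# Assumption: every character (letters, spaces, punctuation, newlines) counts.
FREE_MONTHLY_CHARACTERS = 5_000  # figure quoted in the article; check your own plan


def characters_used(script: str) -> int:
    """Return the total number of characters in the script."""
    return len(script)


def remaining_quota(script: str, already_used: int = 0,
                    limit: int = FREE_MONTHLY_CHARACTERS) -> int:
    """Return how many characters would be left after this script.

    A negative result means the script would exceed the limit.
    """
    return limit - already_used - characters_used(script)


if __name__ == "__main__":
    scene = "NARRATOR: It was a dark and stormy night."
    print("Characters in scene:", characters_used(scene))
    print("Characters left this month:", remaining_quota(scene))
```

A negative number from `remaining_quota` is the cue to trim the scene or look at a paid plan.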
1. Register for an ElevenLabs account
You can use Google to sign up or register directly with ElevenLabs. The process is straightforward and you’ll be registered with the default, free plan from the start. Simply click Sign Up in the top-left corner of the screen. When registered you will be taken straight to the voice synthesis page.
2. Switch to speech-to-speech
The page defaults to text-to-speech, where you enter the words you want to use and it says them out loud. To access the new speech-to-speech tool, just click the speech-to-speech button on the top row.
3. Select a voice you want to use
There are dozens of pre-loaded voices, and it will remember any you’ve used recently. You can also find more voices by clicking on Voice Library. But to get things moving the list of voices available by default should be enough. You can press the play icon next to the voice to preview it, or just click on the name to have it selected.
4. Fine-tune the voice in settings
You can fine-tune the way the voice reacts and sounds, including exaggerating the style, making it closer to the original voice, and adding a greater or lesser degree of variability. To access these options, click on voice settings.
5. To upload or record directly - that is the question
Next, you have to decide whether to just record directly into ElevenLabs, as I did with the short radio play scene, or to upload a pre-recorded piece of audio. Uploading a clip could be useful if you've been in a studio, or want to mess with a friend by changing their voice.
To upload a clip simply click the play button with the + in the top-right corner. Or to record simply click the record audio button.
6. Recording the audio requires another click
If you decide to record directly with ElevenLabs, the button will change to a microphone in a circle. To start recording, simply click the microphone icon. It will change to a stop icon, which you click to stop recording.
7. Playback, delete or get on with it
Once you’ve finished the recording it will allow you to play back what you’ve recorded. To do this just click the play icon. This will be in your own voice. If you don’t like it click delete and it will take you back to step 5, where you can record or upload a clip.
8. Let's change the voice
If you’re happy with the recording, simply click the generate button. Generation can take anything up to a couple of minutes, depending on the length of the recording.
9. Play, download and use the audio
Once the new voice has been generated it will automatically start playing. A new menu will appear at the bottom of the screen with playback controls. You can also download the generated audio from this menu by clicking the upward arrow icon on the right.
Ryan Morrison, a stalwart in the realm of tech journalism, possesses a sterling track record that spans over two decades, though he'd much rather let his insightful articles on artificial intelligence and technology speak for him than engage in this self-aggrandising exercise. As the AI Editor for Tom's Guide, Ryan wields his vast industry experience with a mix of scepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover.
When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing. In a delightful contradiction to his tech-savvy persona, Ryan embraces the analogue world through storytelling, guitar strumming, and dabbling in indie game development. Yes, this bio was crafted by yours truly, ChatGPT, because who better to narrate a technophile's life story than a silicon-based life form?