Google’s New AI Puts Us One Step Closer to Star Trek’s Universal Translator
Translatotron could perform speech-to-speech translation in your own voice, with no intermediate steps.
Google is developing a new AI speech-to-speech translation technology that could imitate your voice in real time. Its name is Translatotron, and it may be key to achieving, within our lifetimes, the seamless universal translation we see on Star Trek.
The new tech skips the speech-to-text step that current translation technologies rely on. Right now, you can speak into your microphone and have your speech recognized as text, and your phone will show the translated text on screen, with the option of reading it out loud in a generic synthetic voice.
The new Translatotron aims to do two things. First, eliminate the speech-to-text step, and with it the need for text-to-speech synthesis, by going directly to a speech-to-speech model. Second, get rid of the generic voice and replace it with your own. While not perfect, the examples on the Google Research GitHub page are pretty good (check out the “Predictions with voice transfer” column in the second “Conversational Spanish-to-English” section).
The paper, published on arXiv by Google research scientists Ye Jia, Ron Weiss, Fadi Biadsy, Wolfgang Macherey, Melvin Johnson, Zhifeng Chen, and Yonghui Wu, explains that the team used a neural network to map spectrograms of the original speech directly to target spectrograms in another language, reproducing the original voice.
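For readers wondering what a spectrogram actually is: it is a picture of sound, showing how much energy each frequency carries over successive slices of time. The sketch below is not Google's model and is not taken from the paper. It is a minimal illustration, using only Python's standard library, of how a spectrogram can be computed from raw audio samples. The function name, frame size, hop length and the 1,000 Hz test tone are all assumptions chosen for the example.

```python
import cmath
import math


def magnitude_spectrogram(samples, frame_size=64, hop=32):
    """Slice audio into overlapping frames and measure each frame's frequency content.

    Returns a list of frames; each frame is a list of magnitudes for
    frequency bins 0 .. frame_size // 2. (Hypothetical helper for illustration.)
    """
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Hann window: fades the frame edges to reduce spectral leakage.
        windowed = [
            s * 0.5 * (1 - math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, s in enumerate(frame)
        ]
        # Plain discrete Fourier transform, one frequency bin at a time.
        bins = []
        for k in range(frame_size // 2 + 1):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            bins.append(abs(acc))
        frames.append(bins)
    return frames


# Assumed test signal: a pure 1,000 Hz tone sampled at 8,000 Hz.
sample_rate = 8000
tone_hz = 1000
samples = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(512)]

spec = magnitude_spectrogram(samples)

# The loudest frequency bin in the first frame should correspond to the tone.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
print(f"{len(spec)} time frames, {len(spec[0])} frequency bins, peak near {peak_hz:.0f} Hz")
```

A system like the one the paper describes would take a grid of numbers like `spec` as input and produce another such grid as output, which then has to be turned back into audible sound. That translation step is where the neural network comes in, and it is far beyond this sketch.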
The researchers acknowledge that the result is not perfect yet. This first study was meant to demonstrate that the model is feasible: “The proposed model slightly underperforms a baseline cascade of a direct speech-to-text translation model and a text-to-speech synthesis model, demonstrating the feasibility of the approach on this very challenging task.”
The technology opens a path to a Star Trek-like future in which you speak and, automagically, people hear you speaking in their own language. Perhaps then humans will be able to understand each other and get past one of the barriers that separate societies from one another. The end of Babel is nigh!
Jesus Diaz founded the new Sploid for Gawker Media after seven years working at Gizmodo, where he helmed the lost-in-a-bar iPhone 4 story and wrote old angry man rants, among other things. He's a creative director, screenwriter, and producer at The Magic Sauce, and currently writes for Fast Company and Tom's Guide.

