Our phones are getting smarter but our on-device transcription tools are not

Recorder app transcript on a Pixel 10 Pro
(Image credit: Future)

No matter the phone I'm using at any given time, I'm going to turn to the built-in voice recording app at some point during a typical work week. Recording interviews, getting audio notes of a meeting, laying down a voice-over track for a video — it doesn't matter what. Since my phone is the device on hand, I'm going to use the app on that phone to handle those tasks. And whether we're talking the iPhone's Voice Memos, the Pixel's Recorder or Voice Recorder on Samsung's phones, each app seems up to the task.

Transcribing those recordings? Well, that's another matter.

When phones first started baking transcription features into their voice recording apps, it felt like a big addition, particularly for people like me who have to record a lot of things, listen back to the recordings and, more often than not, produce accurate records of who said what. And generally, the results were pretty promising — at least if you chalked up transcription errors to a new feature finding its footing.

Well, that footing hasn't really gotten better in recent years. And with phone makers now touting their AI features and advancing on-device smarts, the low quality of transcription capabilities is really starting to stand out.

I was reminded of this a couple weeks back when I interviewed a pair of Qualcomm executives about the launch of the Snapdragon 8 Elite Gen 5 that's going to power many of the best Android phones debuting over the course of the next year. Our conversation, which focused on the expanding AI features that the latest Snapdragon silicon can support, was very enjoyable; going through the transcript generated by my Pixel 10 Pro was less so.

Transcript from the Recorder app on a Pixel 10 Pro

(Image credit: Future)

You had misspelled words, of course, and haphazard punctuation. That's par for the course with mobile transcription features, and something I'd expect to have to fix if I wanted a clean copy of my conversation. What I wasn't prepared for was how many misheard or mistranscribed words made it into the document, mistakes that utterly changed the meaning of what the person was actually saying. Or at least, they would have in instances where the botched transcription offered any discernible meaning at all.

At one point, a Qualcomm executive told me that an AI-powered personal assistant would be an "extension of you." What the Recorder app on the Pixel 10 Pro heard was that AI would be the "extinction of you." Kind of a different interpretation than what the speaker intended, no?

There were three of us speaking in that interview — the transcript identified a fourth speaker. And sometimes, the transcript would have different speakers jumping into the conversation when the audio revealed that it was the same speaker all along.

In one of the more maddening quirks, the on-device transcription would simply drop an entire sentence — the audio file would still have someone speaking quite audibly — before picking up the transcript midway through the next sentence. At least you can't make errors when you don't transcribe anything.

Transcript of a call intercepted by call screening on an iPhone 17

(Image credit: Future)

I don't want to pick on just the Pixel 10 Pro here, because similar features for Apple and Samsung phones run into the same issues. This past summer, my colleagues and I on the Tom's Guide phones team ran a series of tests to determine the best AI phone overall. It fell to me to conduct the transcription tests involving a Pixel 9, iPhone 15 Pro and Galaxy S25 Plus, and none of the three devices really covered themselves in glory.

Multiple speakers prove to be the biggest Achilles' heel for transcription features. The Pixel and Galaxy phones regularly had one speaker overlapping another, while the iPhone doesn't even bother separating speakers when it transcribes your voice recordings.

It's not just the recording apps that suffer from this lack of quality control. Call screening is a big part of the iOS 26 update for iPhones (one of my favorite parts, actually), and it relies on on-the-fly transcripts of recordings to let you know who's calling. The other day, I got a call from someone named Catherine, or so my iPhone would have me believe. The actual caller was Walgreens, and the rest of the message about my prescriptions was no less muddled.

AI outlook

The promise of on-device transcription features, like a lot of the AI that phone makers are pushing, is that they take a mundane task out of our hands and handle it for us. But the reality of what different phone makers are offering is quite the opposite: it's creating more work, as I have to go through each recording transcript to make sure it matches the audio. That has to stop.

So I'm asking everyone — Apple, Google, Samsung — to sort out the transcription features they already have. Figure out how to reduce the errors and the miscues. Because until you do, it's going to be very hard to trust the even loftier claims about on-device AI and what it can do for me.

Philip Michaels is a Managing Editor at Tom's Guide. He's been covering personal technology since 1999 and was in the building when Steve Jobs showed off the iPhone for the first time. He's been evaluating smartphones since that first iPhone debuted in 2007, and he's been following phone carriers and smartphone plans since 2015. He has strong opinions about Apple, the Oakland Athletics, old movies and proper butchery techniques. Follow him at @PhilipMichaels.
