Apple just unveiled new Ferret-UI LLM — this AI can read your iPhone screen

Adobe Firefly generated image of a ferret
(Image credit: Adobe Firefly/Future AI image)

Apple researchers have created an AI model that can understand what’s happening on your phone screen. It is the latest in a growing line-up of models.

Called Ferret-UI, this multimodal large language model (MLLM) can perform a wide variety of tasks based on what it can see on your phone’s screen. Apple's new model can, for instance, identify icon types, find specific pieces of text, and give you precise instructions for what you should do to accomplish a specific task.

These capabilities were documented in a recently published paper that detailed how this specialized MLLM was designed to understand and interact with mobile user interface (UI) screens. 

How Ferret-UI works

Apple Ferret

(Image credit: Apple)

We currently use our phones to accomplish a variety of tasks: we might want to look up information or make a reservation. To do this, we look at our phones and tap the buttons that lead us to our goals.

Apple believes that if this process can be automated, our interactions with our phones will become even easier. It also anticipates that models such as Ferret-UI can help with things like accessibility, testing apps, and testing usability.

For such a model to be useful, Apple had to ensure that it could understand everything happening on a phone screen while also being able to focus on specific UI elements. Overall, it also needed to be able to match instructions given to it in normal language with what it’s seeing on the screen.

For example, Ferret-UI was shown a picture of AirPods in the Apple Store and was asked how one would purchase them. Ferret-UI replied correctly that one should tap on the ‘Buy’ button.
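
To make the idea of matching an instruction to an on-screen element more concrete, here is a minimal Python sketch. It is not Apple's method and does not use Ferret-UI: the real model works from a screenshot, whereas this toy assumes the screen has already been turned into a list of labelled elements. The `UIElement` class, the `find_element` and `tap_point` helpers, and every coordinate are hypothetical, made up purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class UIElement:
    """A hypothetical on-screen element (not a real Ferret-UI data type)."""
    kind: str                       # e.g. "button", "text", "icon"
    label: str                      # visible text or accessibility label
    box: Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def find_element(screen: List[UIElement], query: str,
                 kind: Optional[str] = None) -> Optional[UIElement]:
    """Return the first element whose label contains the query (case-insensitive)."""
    needle = query.lower()
    for element in screen:
        if kind is not None and element.kind != kind:
            continue  # skip elements of the wrong type
        if needle in element.label.lower():
            return element
    return None


def tap_point(element: UIElement) -> Tuple[int, int]:
    """Centre of the element's bounding box, where a tap would land."""
    left, top, right, bottom = element.box
    return ((left + right) // 2, (top + bottom) // 2)


# Made-up screen loosely modelled on the AirPods example; all values are illustrative.
screen = [
    UIElement("text", "AirPods", (40, 120, 350, 160)),
    UIElement("icon", "Share", (340, 60, 370, 90)),
    UIElement("button", "Buy", (280, 600, 360, 640)),
]

target = find_element(screen, "buy", kind="button")
if target is not None:
    print(f"Tap '{target.label}' at {tap_point(target)}")
```

A simple keyword match like this only works when the element list is handed over ready-made. The harder part, which is what a model such as Ferret-UI is described as doing, is recognising those elements from the pixels and relating them to a request phrased in normal language.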

Why Ferret-UI is important

With most of us having a smartphone in our pocket, it makes sense that companies are looking into how they can add AI capabilities tailored to these smaller devices. 

Research scientists at Meta Reality Labs already anticipate that we’ll be spending more than an hour every day either in direct conversations with chatbots or having LLM processes run in the background powering features such as recommendations.

Meta's chief AI scientist Yann LeCun goes as far as to say AI assistants will mediate our entire digital diet in the future.

So while Apple didn’t spell out what exactly its plans for Ferret-UI are, it’s not too hard to imagine how such a model can be used to supercharge Siri to make the iPhone experience a breeze, possibly even before the year is over.

Christoph Schwaiger

Christoph Schwaiger is a journalist, mainly covering AI, health, and current affairs. His stories have been published by Tom's Guide, Live Science, New Scientist, and the Global Investigative Journalism Network, among other outlets. Christoph has appeared on LBC and Times Radio. Additionally, he previously served as a National President for Junior Chamber International (JCI), a global leadership organization, and graduated cum laude from the University of Groningen in the Netherlands with an MA in journalism. You can follow him on X (Twitter) @cschwaigermt.