Google I/O turns the focus to Gemini AI — here's what Google could announce

Google I/O 2024 graphic
(Image credit: Google)

Update: Today is Google I/O. Follow our Google I/O 2024 live blog for all the big news as it happens. 

Google hosts Google I/O 2024 Tuesday (May 14), and while there will be new updates across mobile, home and wearable devices during the annual developer event, AI will be center stage.

Google Gemini is the search giant's family of artificial intelligence models that's increasingly finding a space front-and-center in everything Google makes, from replacing Assistant on Android to powering analysis in search results.

What we’ll likely see at I/O is a new version of Gemini, further integration across yet more products and multimodal features coming to the Gemini chatbot, giving it the ability to take in speech, code, music and video for the first time. (You can find out what Google announces for yourself with our guide on how to stream the Google I/O keynote.)

Rumors suggest that we may also see Gemini adopt some of the more prominent features of rival OpenAI's ChatGPT, including persistent memory across all conversations. But with OpenAI unveiling its new GPT-4o model with built-in voice assistant and vision features, Google will have to play catch-up.

What to expect from the Gemini models

Gemini 1.5 Pro

(Image credit: Google)

Google likes to confuse people, or at least that's how it feels sometimes. The name Gemini applies to the underlying large language models, the Assistant replacement on Android, the chatbot and the AI autocomplete in Workspace.

To confuse things even further, there are three versions of Gemini. Nano runs on phones and other small devices. Pro runs in the cloud and powers the Assistant and the free version of the Gemini chatbot. Ultra is the most powerful model, at least on paper, and it powers the $20/month Gemini Advanced.

Earlier this year, Google unveiled Gemini 1.5 Pro. This was a big upgrade over the previous generation of Gemini, adding better understanding, music and video input, and a massive one-million-token context window, which is how much data it can store and reference within a single conversation.

Gemini 1.5 Pro is still only available to developers and researchers. While it doesn't have the reasoning of Gemini Ultra, in many ways it is more powerful.

At Google I/O, I suspect we will see some correction of this situation with 1.5 version upgrades to each of the free models in the family. They are also likely to be made available to the Gemini chatbot and Android Assistant.

New AI features at Google I/O

Google has already teased a new version of Gemini, which leverages Google's voice assistant and video features to describe what's going on in your camera's view and provide assistance. We expect to hear a lot more about this feature. 

Gemini can do a lot more than is currently possible through its chat or voice interfaces, including taking in video and music content. I suspect both interfaces will be upgraded to add these new input options at I/O.

I think we will also see integration with other Google products and services, bringing more generative AI features to Photos, Docs and Slides. These will also be more tightly integrated into the Gemini Assistant and chatbot.

One of the more useful aspects of Gemini over ChatGPT is its deep integration with the Google ecosystem. Accessible via extensions, this includes access to search, flights, maps, all of your documents and, of course, YouTube. Even YouTube Music is joining this extensions list, albeit only in the Android Assistant version of Gemini.

While unlikely, one thing we might see is Google adding third-party providers to the extensions list. This would mirror functionality available in ChatGPT and Microsoft Copilot. If Google does add this, we could see companies like Uber and Kayak plug into Gemini. In Assistant, for example, you could then plan a trip and manage all bookings from within chat.

Google vs. the AI competition

A phone with the ChatGPT logo and a laptop with the OpenAI logo

(Image credit: Shutterstock)

AI is moving away from text and toward voice, with every AI lab working on synthetic voice technology.

We're also moving away from chat and toward agents, where you instruct the AI to perform a series of tasks on your behalf rather than just having a friendly chat.

This is something we are already seeing from OpenAI. Apple is also said to be looking at this as an approach for Siri 2.0, which we expect to see at WWDC 2024 next month. And to some extent, Google is doing versions of this with the Gemini Assistant on Android.

Ryan Morrison
AI Editor

Ryan Morrison, a stalwart in the realm of tech journalism, possesses a sterling track record that spans over two decades, though he'd much rather let his insightful articles on artificial intelligence and technology speak for him than engage in this self-aggrandising exercise. As the AI Editor for Tom's Guide, Ryan wields his vast industry experience with a mix of scepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover. When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing. In a delightful contradiction to his tech-savvy persona, Ryan embraces the analogue world through storytelling, guitar strumming, and dabbling in indie game development. Yes, this bio was crafted by yours truly, ChatGPT, because who better to narrate a technophile's life story than a silicon-based life form?