Google flashes everyone — new Gemini Flash 1.5 takes on GPT-4o

Google has launched a new member of the Gemini family of artificial intelligence models. Sitting between the on-device Nano and cloud-based Pro, Gemini Flash is designed for chat, complex tasks that require a fast response and handling images, video and speech.

Unveiled at the annual Google I/O developer event, Gemini Flash 1.5 is a native multimodal model similar to OpenAI’s recently unveiled GPT-4o and was built for speed, making it useful for real-time conversations.

The new model is currently available globally for developers to use in their own applications, so we could see a number of third-party live chat apps built using Gemini Flash 1.5 soon. 

Google also announced an upgrade to Gemini Pro 1.5, the model first released earlier this year, along with the news that it will now power the Gemini Advanced premium chatbot.

What makes Gemini Flash 1.5 different?

Gemini Flash 1.5 sits just above Nano and just below Pro in the size hierarchy. What makes it different, not just from its siblings but from other AI models, is the combination of speed and agility.

In addition to being fast and impressive in its ability to understand text, images, video and speech, Flash 1.5 is cheap, at least compared to Pro, which is 20 times more expensive.

“We know from user feedback that some applications need lower latency and a lower cost to serve,” said Google DeepMind CEO Demis Hassabis. “This inspired us to keep innovating,” he added, unveiling Flash as a “model that’s lighter-weight than 1.5 Pro, and designed to be fast and efficient to serve at scale.”

A good comparison, at least in terms of speed, is with OpenAI’s recently announced GPT-4o model. It is very fast, natively multimodal and designed for real-time interaction. That said, Gemini Flash 1.5 seems to be a less capable model in terms of reasoning.

What about the massive context window?

Like other Gemini family models, Flash 1.5 comes with a massive one million token context window and the promise of actually being able to utilize it in full. In comparison, GPT-4o has a 128,000 token context window and Claude 3 is at 200,000 tokens.

What makes a large context window so important is that it lets a model hold a massive amount of information in memory within a single conversation. This is vital when it comes to analyzing non-text content, as an image is worth 1,000 words and a video even more.
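To put the window sizes quoted above side by side, here is a rough Python sketch. The token figures are the ones given in this article; the helper names `window_ratio` and `fits` are made up for illustration and are not part of any vendor's API.

```python
# Context window sizes in tokens, as quoted in this article.
CONTEXT_WINDOWS = {
    "Gemini Flash 1.5": 1_000_000,
    "Claude 3": 200_000,
    "GPT-4o": 128_000,
}


def window_ratio(model: str, baseline: str) -> float:
    """How many times larger one model's window is than another's."""
    return CONTEXT_WINDOWS[model] / CONTEXT_WINDOWS[baseline]


def fits(token_count: int, model: str) -> bool:
    """Whether a prompt of token_count tokens fits in the model's window."""
    return token_count <= CONTEXT_WINDOWS[model]


if __name__ == "__main__":
    for rival in ("GPT-4o", "Claude 3"):
        ratio = window_ratio("Gemini Flash 1.5", rival)
        print(f"Flash 1.5 window is {ratio:.1f}x the size of {rival}'s")
```

On these numbers, Flash 1.5 has roughly eight times the room of GPT-4o and five times that of Claude 3, so a 500,000-token prompt would fit in Flash 1.5 alone.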

Flash 1.5 was also trained by its big brother, Gemini Pro 1.5. Hassabis said this was done “through a process called ‘distillation,’ where the most essential knowledge and skills from a larger model are transferred to a smaller, more efficient model.”

“1.5 Flash excels at summarization, chat applications, image and video captioning, data extraction from long documents and tables, and more,” as a result of this process, he said.
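Google has not said exactly how Flash was distilled, so the following is only a textbook illustration of the general idea, not the company's method. In the classic formulation, a small "student" model is trained to match the softened output probabilities of a large "teacher" model. The function names, the temperature value and the example scores are all assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; higher temperature softens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's soft predictions to the student's.

    The loss is zero when the student matches the teacher exactly and
    grows as the two distributions drift apart.
    """
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return sum(
        t * math.log(t / s) for t, s in zip(teacher, student) if t > 0
    )


if __name__ == "__main__":
    teacher_scores = [4.0, 1.0, 0.5]  # hypothetical teacher output
    student_scores = [2.0, 1.5, 1.0]  # hypothetical student output
    print(distillation_loss(teacher_scores, student_scores))
```

Training then nudges the student's weights to push this loss toward zero, which is how a smaller model can inherit much of a larger one's behavior.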

As these models, including the faster but smaller ones like Flash, gain the ability to understand more than just text, that increased context window becomes even more important.

Ryan Morrison
AI Editor

Ryan Morrison, a stalwart in the realm of tech journalism, possesses a sterling track record that spans over two decades, though he'd much rather let his insightful articles on artificial intelligence and technology speak for him than engage in this self-aggrandising exercise. As the AI Editor for Tom's Guide, Ryan wields his vast industry experience with a mix of scepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover. When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing. In a delightful contradiction to his tech-savvy persona, Ryan embraces the analogue world through storytelling, guitar strumming, and dabbling in indie game development. Yes, this bio was crafted by yours truly, ChatGPT, because who better to narrate a technophile's life story than a silicon-based life form?