Google unveiled its Gemini family of artificial intelligence models earlier this week to much fanfare and excitement. However, on reflection it appears some sleight-of-hand may have been at play to make the AI look slightly more impressive than it is in reality.
As part of its marketing for Gemini Ultra (the most advanced version of its new AI model), Google showed it responding in real time to a series of tasks. This included playing a cup and ball game, detecting changes in a drawing, and identifying locations on a map.
In the video, it seems like this is all happening live, with Gemini responding to changes as they happen, but this isn’t exactly the case. While the responses are real, they were generated from still images or in segments rather than in real time. Put simply, the video was more a marketing exercise than a technical demo.
So what went wrong?
In the YouTube description, Google states that “for the purposes of this demo, latency has been reduced and Gemini outputs have been shortened for brevity.” That takes away one of the most exciting aspects: the real-time responses to a video.
That isn’t to say Gemini Ultra won’t be impressive. It was built to be multimodal from the ground up, able to understand images, text, video, code, and sound natively. But nobody outside Google knows how good it will be as nobody has had the chance to try it.
There are other videos showing different tasks without the marketing gloss that also seem impressive, and its use in scientific research seems genuine. One video suggests Gemini Ultra was capable of reading, analyzing and finding patterns in over 200,000 research papers in an hour.
What about Bard with Gemini?
Google made a big deal out of Bard and other products getting access to Gemini immediately, and that is the case. You can visit the Bard website and try Gemini for yourself today, but this isn’t the GPT-4 competitor promised with Ultra. This is Gemini Pro, which “just” competes with the free version of ChatGPT.
However, the company also revealed some impressive statistics that seemed to show Gemini beating GPT-4 on the majority of standard benchmark tests. The problem is that this was Gemini Ultra — the model we aren’t likely to see until later next year.
This means that for now, Bard with Gemini is roughly on par with the free version of ChatGPT, which has been out for the best part of a year. Even when it does get Gemini Ultra, it will only equal the capabilities of ChatGPT Plus rather than exceed them.
So Google is still behind?
Despite the big fanfare, excitement, and wall-to-wall coverage, what Gemini Ultra will do for Google is bring it back into the game. It will put its model on an equal footing with GPT-4, the AI that powers ChatGPT Plus, Microsoft Copilot, and a host of other applications.
The problem for Google is that OpenAI continues to develop its GPT architecture, with CEO Sam Altman confirming it had already begun training GPT-5.
This next-generation model from OpenAI could be an early form of superintelligence and is likely to outperform humans on many of the standard benchmarks. It is also expected to have significantly improved reasoning and math capabilities over GPT-4.
Is there good news for Google?
One key advantage Google has is that Gemini also comes in Nano, a micro-version of the AI model that can run locally on Android phones. This will allow developers to create apps using generative AI without paying cloud computing fees or sending data off the device.
Google has a small window to upgrade and, most importantly, ship if it wants to stay in the game. The company says it is currently putting Gemini Ultra through security and safety testing, but if it wants to keep up with the pace OpenAI and Microsoft are setting, it needs to get products out.
Ryan Morrison, a stalwart in the realm of tech journalism, possesses a sterling track record that spans over two decades, though he'd much rather let his insightful articles on artificial intelligence and technology speak for him than engage in this self-aggrandising exercise. As the AI Editor for Tom's Guide, Ryan wields his vast industry experience with a mix of scepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover.
When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing. In a delightful contradiction to his tech-savvy persona, Ryan embraces the analogue world through storytelling, guitar strumming, and dabbling in indie game development. Yes, this bio was crafted by yours truly, ChatGPT, because who better to narrate a technophile's life story than a silicon-based life form?