The end of unlimited AI: Why Google’s Gemini leak is a warning for every power user

Gemini
(Image credit: Shutterstock)

For the past year, AI has felt unlimited: ask anything, generate endlessly. Many users have done exactly that, treating ChatGPT like a faster Google. Google, for its part, was known for giving away its best Gemini models "for free" so users could prompt heavily. Now the company is making it clear those days are over.

In details first spotted by 9to5Google, what looked like a routine update revealed new usage limits across Gemini’s most powerful features. The company isn’t just tweaking a pricing tier; it’s signaling that AI isn’t unlimited, and it never really was. Once you see it, you can’t unsee it.

What actually changed with Gemini

Gemini

(Image credit: Future)

Google’s latest update reshapes how access to Gemini works, particularly for higher-tier users. Limits are now:

  • Tied to specific features (like Deep Research and advanced tools)
  • Adjusted based on usage patterns
  • Structured more like a quota system than a free-flowing assistant

That might sound like a small backend change, but it fundamentally shifts how AI feels to use. Because now, there’s a ceiling. Whether we like it or not, our access has changed. You've probably already noticed it if your AI responses have suddenly gotten shorter, less helpful or slower to arrive.

To be clear, this isn't just happening with Gemini. There’s a growing pattern across tools like Claude, ChatGPT and Perplexity AI, too. When users hit a "soft limit," the responses change. If you use a premium feature too often, such as Gemini Veo 3.1 or ChatGPT-5.5 Thinking, you may find access gets throttled. The model itself isn't getting worse, but the version you're using might be. In other words, you think you're using a more advanced model, but due to usage limits, you've quietly been dropped down to a lesser one.
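To make the idea concrete, here's a minimal sketch of how a "soft limit" like the one described above could work behind the scenes. Everything in it (the quota number, the model names, the routing rule) is invented for illustration; no vendor publishes its actual routing logic.

```python
# Hypothetical illustration of a "soft cap": once a user's recent usage
# crosses a quota, their requests are silently routed to a cheaper model.
# The quota, model names and logic below are assumptions, not any
# vendor's real system.

DAILY_QUOTA = 20  # hypothetical number of premium requests per day

def route_model(premium_requests_today: int) -> str:
    """Pick which model tier serves the next request."""
    if premium_requests_today < DAILY_QUOTA:
        return "premium-model"      # full-quality responses
    return "lightweight-model"      # quiet downgrade past the cap

# The user never sees an error message -- just a different model answering.
print(route_model(5))    # premium-model
print(route_model(25))   # lightweight-model
```

The key design point is that nothing fails loudly: the request still succeeds, which is exactly why the downgrade is easy to miss.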


This shift is subtle, and most people don't even realize it's happening. They may get a generic response and simply prompt again, only to push their usage limits even further.

The real reason this is happening

data center cooling

(Image credit: Shutterstock)

Big Tech is paying a lot of money for data centers, energy and model training, and users are now shouldering some of those soaring costs. This isn't about Google being restrictive; it's about physics and the cost of massive GPU clusters, constant energy consumption and very expensive real-time inference.

That means every “ask” isn’t free; it's a compute event. And as millions of people start using AI like a daily assistant, those costs scale fast. So instead of offering unlimited access, companies are quietly shifting to tiered usage, feature gating and soft caps. It's kind of like how the best streaming services all started raising their prices once they had gotten enough people to sign up.

For power users, this means AI use is something you have to budget as we enter the "AI credit" era. Even if these companies aren't calling it out yet, that's what it is. Frankly, it's no different than mobile data plans, streaming tiers or cloud storage limits.

The takeaway

This shift changes how you should actually use AI. For example, generating AI images is fun, but it will cost you more than it used to. And, instead of treating AI like an endless chat, it's time to be more intentional with prompts and use different tools for different tasks. If you have heavy requests, save them for when they really matter.

The people who win with AI going forward will be the ones who know how to use it better and more strategically. What do you think? Will these limits make you rethink how you use AI? Let me know in the comments.


Amanda Caswell
AI Editor

Amanda Caswell is one of today’s leading voices in AI and technology. A celebrated contributor to various news outlets, her sharp insights and relatable storytelling have earned her a loyal readership. Amanda’s work has been recognized with prestigious honors, including outstanding contribution to media.

Known for her ability to bring clarity to even the most complex topics, Amanda seamlessly blends innovation and creativity, inspiring readers to embrace the power of AI and emerging technologies. As a certified prompt engineer, she continues to push the boundaries of how humans and AI can work together.

Beyond her journalism career, Amanda is a long-distance runner and mom of three. She lives in New Jersey.
