I kept hitting AI limits — so I built a 3-step system that cut my usage by 60%

I didn’t realize how much of an AI power user I was until I started hitting my usage limit before 9 a.m. Whether I was using Gemini to create presentations in NotebookLM, using Claude Cowork to run my husband's side hustle or taking advantage of the apps in ChatGPT, I was burning through prompts like they were unlimited.

Now that Google has really put the hammer down on Gemini usage limits (and frankly, Claude has never been generous with its limits), I knew I needed a better system.

I didn't have the time or the tokens for extra follow-ups, and I knew I'd get locked out right when I was finally getting somewhere. That’s when I decided to cut the inefficiency and built a simple 3-step system to fix it. I call it a Token Buffer, and within a week, it cut my usage by about 60% without slowing me down.

Why this matters right now

Big Tech is investing billions in AI, and now users are paying the price. We used to get so much more for free. Now that we don't, we have to be far more strategic with how we prompt AI. As a certified prompt engineer, I can't say this enough: one messy prompt can waste 5-10 follow-ups. It's time to stop using AI like it's Google and start prompting with intention.

AI limits are changing how useful these tools actually are. Because message caps are tighter and "pro" tiers actually aren't unlimited, you might be burning through your usage without realizing it until it's too late.

My 3-step 'Token Buffer' system

The good news is that, even with limited usage, the system is easy enough for anyone to use. It's simply a small shift in how you use AI before, during and after each prompt.

Here's how it works:

1. Buffer before you ask. Start structuring your prompts. Rather than typing immediately, take 10-20 seconds to write out exactly what you need, then add context upfront (goal, constraints, format). This combination turns 3-4 prompts into one. The result is fewer follow-ups and better first answers.
2. Batch your prompts. Stop drip-feeding models. Whatever chatbot you're using, rather than saying "Help me with this" and then "Change this," batch everything into one structured prompt that spells out exactly what you need. That gets you closer to the final answer right out of the gate rather than wasting prompts on refinement.
3. Extract once, reuse often. Instead of starting from scratch every time, I now save strong outputs and reuse the frameworks, formats and structures that I know work. Plus, I always have memory enabled (except on Gemini). The result is that I avoid spending tokens on the same problem twice.
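For anyone who keeps their prompts in a script or notes file, steps two and three can be sketched in a few lines of Python. This is only an illustration, not part of my actual workflow: the `PromptBatch` class, the `saved_formats` dictionary and the example tasks are all hypothetical names made up for this sketch, and nothing here calls a real chatbot API.

```python
from dataclasses import dataclass, field


@dataclass
class PromptBatch:
    """Collects several related asks so they go out as one message."""
    goal: str
    tasks: list[str] = field(default_factory=list)

    def add(self, task: str) -> None:
        # Queue a request instead of sending it as its own follow-up.
        self.tasks.append(task.strip())

    def render(self) -> str:
        # Build one structured prompt: the goal first, then numbered tasks.
        lines = [f"Goal: {self.goal}", "Do all of the following in one response:"]
        lines += [f"{i}. {task}" for i, task in enumerate(self.tasks, start=1)]
        return "\n".join(lines)


# "Extract once, reuse often": a tiny library of formats that worked before.
# The entry below is a made-up example.
saved_formats = {
    "summary": "Output format: 3 bullet points, then one recommended next step.",
}

batch = PromptBatch(goal="Tighten my product announcement email")
batch.add("Cut it to under 150 words")
batch.add("Make the subject line more specific")
batch.add("Suggest one alternative closing line")

# One message instead of three separate follow-ups.
prompt = batch.render() + "\n" + saved_formats["summary"]
print(prompt)
```

The point of the design is that the three asks cost one message, and the output format is typed once and pulled from the saved library every time after that.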

What's changed for me (besides tokens)

After just a few days, I was still bracing myself to hit my limits, but I never did. I was also getting better results and getting more done in a single prompt. By spending less time "chatting," I was actually getting results. I enjoy chatting and brainstorming with AI, but that's going to have to wait for the weekend. During the week, when I've set up ChatGPT Tasks or have Claude working autonomously for me, I need to focus on not wasting a single token.

The big shift now is that power users and anyone on a free tier need to stop treating AI like a chatbot in a conversation and start treating it more like a system.

Try this before your next prompt: “Here’s my goal: [insert]. Constraints: [insert]. Output format: [insert]. Give me the best possible version in one response.”
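If you would rather fill in that template the same way every time, here is a minimal Python sketch. The `build_prompt` function and its parameter names are assumptions made for illustration; the only thing taken from the article is the wording of the template itself.

```python
# The template text mirrors the prompt above; the placeholder names are
# hypothetical and exist only for this sketch.
TEMPLATE = (
    "Here's my goal: {goal}. Constraints: {constraints}. "
    "Output format: {output_format}. "
    "Give me the best possible version in one response."
)


def build_prompt(goal: str, constraints: str, output_format: str) -> str:
    """Fill the template, refusing to send a prompt with an empty field."""
    fields = {
        "goal": goal,
        "constraints": constraints,
        "output_format": output_format,
    }
    # A blank field usually means a follow-up later, so fail early instead.
    missing = [name for name, value in fields.items() if not value.strip()]
    if missing:
        raise ValueError(f"Fill in before sending: {', '.join(missing)}")
    # Trim whitespace and trailing periods so the template's own
    # punctuation is not doubled.
    cleaned = {name: value.strip().rstrip(".") for name, value in fields.items()}
    return TEMPLATE.format(**cleaned)


example = build_prompt(
    goal="a one-page summary of my quarterly sales notes",
    constraints="under 300 words, no jargon",
    output_format="short intro plus five bullet points",
)
print(example)
```

The empty-field check is the code version of the 10-20 second buffer: it forces you to decide on the goal, constraints and format before anything is sent.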

The takeaway

It's been a good run with unlimited usage, but now it's time to buckle up for a new era of AI. The more integrated AI becomes in our daily lives, the more usage will start to look like a resource we pay for as we use it (think water, electricity, internet).

By making that shift now, you'll stop thinking in back-and-forth prompts and start thinking in systems, which will make those limits stretch a lot further than you expect. Let me know what you think about the "new era of usage limits." Are you prepared? How often do you hit limits? I'd love to hear your thoughts in the comments.

Amanda Caswell is one of today’s leading voices in AI and technology. A celebrated contributor to various news outlets, her sharp insights and relatable storytelling have earned her a loyal readership. Amanda’s work has been recognized with prestigious honors, including outstanding contribution to media.

Known for her ability to bring clarity to even the most complex topics, Amanda seamlessly blends innovation and creativity, inspiring readers to embrace the power of AI and emerging technologies. As a certified prompt engineer, she continues to push the boundaries of how humans and AI can work together.

Beyond her journalism career, Amanda is a long-distance runner and mom of three. She lives in New Jersey.