Google just launched Gemini 3.1 Flash-Lite — 7 prompts to test its new 'Thinking' mode

Google just launched Gemini 3.1 Flash-Lite — and while the “Pro” models grab headlines for Ph.D-level reasoning, this is the version most people will actually use all day long.

Flash-Lite is built for speed and efficiency. It’s lightweight, low-cost and optimized for the kinds of tasks you run dozens of times a day — summarizing emails, fixing code snippets, translating messages or extracting data from messy text. In other words, it’s designed for instant responses with better reasoning.

But "Lite" doesn't mean limited. With new adjustable Thinking Levels, you can tell Gemini 3.1 Flash-Lite to slow down and reason more carefully before responding. That means you get a noticeable accuracy boost without the heavy lag you’d expect from a larger model.

If you’re curious what Google’s fastest everyday AI can really do, here are 7 prompts worth trying right now.

Article continues below

1. The 'think twice' logic test

Gemini 3.1 Flash-Lite benchmarks — (Image credit: Google Gemini)

The Prompt: "Set thinking level to High. Solve this: A man is looking at a photograph of someone. His friend asks who it is. The man replies, 'Brothers and sisters, I have none. But that man's father is my father's son.' Who is in the photograph?"

One of the coolest features of Gemini 3.1 Flash-Lite is the ability to toggle its "Thinking" level. Most small models trip up on riddles, but Flash-Lite can handle them if you tell it to slow down.

By forcing the model into "High Thinking," you’re using the new Deep Think Mini tech to ensure it doesn't just guess the most common (and often wrong) answer.

2. The instant 'vibe code' landing page

screenshot of gemini code — (Image credit: Future)

The Prompt: "Write the HTML and Tailwind CSS for a sleek, dark-mode landing page for a fictional retro-synthwave record store called 'Neon Needle.' Include a hero section with a glowing 'Enter Shop' button."

Gemini 3.1 makes vibe coding easy —even if you're new to it, this model can take any idea you describe and builds the code. Flash-Lite is fast enough to do this in seconds. Flash-Lite excels at generating clean, functional code for UI/UX tasks almost instantly.

3. The multi-file PDF deep dive

A person on a laptop converting a PDF to a DOC — (Image credit: Shutterstock)

The Prompt: [Upload 3-4 PDFs, like an apartment lease or a terms of service agreement] "Compare these documents and create a bulleted list of the three most 'anti-consumer' clauses found across all of them. Use simple language."

With a 1-million token context window, you can throw massive documents at Flash-Lite. While Claude also has the same 1-million token context window, this model is arguably the best model for summarizing boring paperwork because it’s so cheap to run.

The model uses the massive context window to look at everything at once, rather than reading one file at a time.

4. No nonsense translation

The Prompt: "System Instruction: You are a professional translator. Output ONLY the translation with no intro or outro. Prompt: Translate this slang-heavy email into formal business Japanese: 'Hey team, we're totally crushing it, but we need to pivot the Q3 strategy before the investors freak out.'"

Small models are great at translation because they don't get "chatty." You're not going to get unnecessary follow-up questions or excess info. Flash-Lite is optimized for high-volume, low-latency tasks like this.

5. Video 'Clifnotes'

Prompt: "[Link a YouTube video of a tech keynote or recipe] "Find the exact timestamp where they mention how long this bakes and put the list of ingredients into bullet points."

You can feed Gemini 3.1 Flash-Lite an hour-long video, and it will "watch" it for you to find specific moments. Its multimodal "vision" is incredibly efficient at scrubbing through video frames to find visual or spoken data.

6. The structured data extractor

A stock photo of a person on their phone looking at a spreadsheet while several graphs are displayed on the laptop in front of them. — (Image credit: Shutterstock)

Prompt: [Paste a messy list of names, dates, and prices from an email] "Extract all the names and dates from this text and format it as a clean Markdown table. If a price is missing, put 'N/A' in that column."

If you have a messy pile of text, Flash-Lite can turn it into a clean table or JSON file for your spreadsheet. This is the "bread and butter" of the Lite model — taking unstructured "garbage" text and making it useful.

7. Real-time presentation coach

Lepow portable monitor in use during a meeting — (Image credit: Lepow)

Prompt: "I’m going to record myself practicing a 30-second elevator pitch. Listen to my audio, transcribe it, and tell me if I sounded too nervous or if my main point was clear."

Because it's so fast, Flash-Lite is the best candidate for "live" feedback. The low "Time to First Token" (TTFT) means you aren't awkwardly waiting for the AI to process your voice; it feels like a real conversation.

Try using this type of prompt with difficult conversations, parenting tone check, dating confidence, rambling check and so much more.

The takeaway

Flash-Lite is currently available in Google AI Studio and Vertex AI, where it’s optimized to deliver intelligence at a lower cost for developers and enterprises running high-throughput workloads.

You might assume “Lite” means watered down. In 2026, it really means faster and smoother. While Gemini 3.1 Pro is built for deep technical work, Flash-Lite is built for everyday speed — the version that summarizes your inbox, fixes a stray line of code or translates a message instantly, without making you stare at a spinning wheel.

Users of the Gemini app continue to have access to models like Gemini 3 Flash and 3.1 Pro, which offer equal or stronger performance across a range of benchmarks. These prompts will work with Gemini 3 Flash; give them a try and let me know what you think in the comments.

Follow Tom's Guide on Google News and add us as a preferred source to get our up-to-date news, analysis, and reviews in your feeds.

More from Tom's Guide

Amanda Caswell is one of today’s leading voices in AI and technology. A celebrated contributor to various news outlets, her sharp insights and relatable storytelling have earned her a loyal readership. Amanda’s work has been recognized with prestigious honors, including outstanding contribution to media.

Known for her ability to bring clarity to even the most complex topics, Amanda seamlessly blends innovation and creativity, inspiring readers to embrace the power of AI and emerging technologies. As a certified prompt engineer, she continues to push the boundaries of how humans and AI can work together.

Beyond her journalism career, Amanda is a long-distance runner and mom of three. She lives in New Jersey.

You must confirm your public display name before commenting

Please logout and then login again, you will then be prompted to enter your display name.