I tested Gemini 3 Flash vs. DeepSeek with 9 prompts — the winner surprised me
Nine real-world tests reveal how Gemini 3 Flash and DeepSeek differ when accuracy, structure and judgment matter most
Gemini has been in the headlines a lot lately for its speed and reasoning, and for surpassing ChatGPT on benchmark tests. But I couldn’t help but wonder how it compared to an under-the-radar chatbot that took the spotlight a year ago and has quietly been getting smarter ever since.
Remember DeepSeek? Over the past year, it has focused on tightening its fundamentals. The company rolled out multiple updates to its reasoning-focused models, improving step-by-step logic, constraint following and long-form accuracy. It also refined code generation, delivering cleaner outputs, better documentation and more consistent formatting. Recent releases emphasize efficiency and judgment, reducing unnecessary verbosity, improving tool-use reliability and strengthening performance across math, logic and structured analysis tasks.
Gemini 3 Flash is designed to combine strong reasoning with very fast responses and lower cost. It retains much of the core intelligence of the larger Gemini 3 Pro models while operating at Flash-level latency, making it ideal for quick answers, coding assistance, document analysis and responsive apps. Here’s how the two models stacked up across nine challenging tests — and which one ultimately came out ahead.
1. Reasoning and logic
Prompt: A farmer has 17 sheep. All but 9 die. How many sheep are left? Explain your reasoning step by step.
Gemini 3 Flash was slightly more verbose than necessary for a simple riddle, and the follow-up question it tacked on weakened the focus.
DeepSeek delivered exactly what the prompt asked for: step-by-step reasoning, then the answer.
Winner: DeepSeek wins for showing better judgment by stopping when the work was done, though both chatbots reached the correct answer of nine ("all but 9 die" means nine sheep remain, not 17 − 9 = 8).
2. Creative writing
Prompt: Write a 150-word story about a time traveler who accidentally prevents their own birth but doesn't disappear. Make it thoughtful and surprising.
Gemini 3 Flash wrote thoughtfully and in a way that exudes confidence without overwriting. It offered sensory grounding that contrasts beautifully with the abstract concept in the story.
DeepSeek was clear and competent, but the writing felt familiar. The ending of the story was solid, but expected.
Winner: Gemini wins for delivering the more original, haunting take – the kind that sticks with you after you stop reading.
3. Code generation
Prompt: Write a Python function that finds the longest palindromic substring in a given string. Include comments and explain the time complexity.
Gemini 3 Flash provided a correct and efficient expand-around-center solution with solid inline comments and a clear explanation of time and space complexity. However, the formatting is cluttered, and the explanation feels more like an info dump than a polished, developer-ready answer.
DeepSeek delivered clean, readable code with strong docstrings, test cases and a well-structured explanation that directly addresses the prompt. It showed better engineering judgment by clearly outlining tradeoffs and alternative approaches.
Winner: DeepSeek wins for a clearer, more complete answer that feels production-ready rather than just technically correct.
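For readers curious about the technique both models reached for, here's a minimal sketch of the expand-around-center approach; it's my own simplified version for illustration, not either chatbot's exact output:

```python
def longest_palindromic_substring(s: str) -> str:
    """Return the longest palindromic substring of s.

    Time complexity: O(n^2) -- each of the 2n-1 centers can expand
    up to n/2 steps. Space complexity: O(1) beyond the returned slice.
    """
    if not s:
        return ""

    def expand(left: int, right: int) -> tuple[int, int]:
        # Grow outward while the characters match, then return the
        # bounds of the palindrome found around this center.
        while left >= 0 and right < len(s) and s[left] == s[right]:
            left -= 1
            right += 1
        return left + 1, right - 1  # step back to the last valid match

    start, end = 0, 0
    for i in range(len(s)):
        # Odd-length palindromes are centered on a character;
        # even-length ones sit between two characters.
        for lo, hi in (expand(i, i), expand(i, i + 1)):
            if hi - lo > end - start:
                start, end = lo, hi
    return s[start:end + 1]


print(longest_palindromic_substring("babad"))  # prints "bab"
```

Run on "babad", it returns "bab" (though "aba" would be equally valid), which is roughly the shape of answer both chatbots produced.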
4. Analysis & synthesis
Prompt: What's your take on the just-released jobs report and what does it say about the state of the U.S. economy?
Gemini 3 Flash offered a more interpretive, narrative-driven analysis. Its strength was synthesis and storytelling, even though it went far beyond what the question strictly required.
DeepSeek stayed tightly aligned with the prompt, delivering a structured, fact-based summary that clearly explained what the jobs report says about the economy without drifting into speculation.
Winner: DeepSeek wins for the more balanced summary that accurately captured the slowdown in hiring while grounding it in broader economic context, without overreaching or editorializing.
5. Mathematical problem-solving
Prompt: If f(x) = 3x² - 2x + 1, find the minimum value of the function and the x-coordinate where it occurs. Show all work.
Gemini 3 Flash clearly showed all the work using two valid methods (vertex formula and calculus), with clean formatting and correct math throughout. It’s thorough, well-organized and easy to follow for a student.
DeepSeek correctly answered and showed the main steps, but the formatting is messy and harder to read, with duplicated symbols and visual clutter. While mathematically sound, it feels less polished and less clear.
Winner: Gemini wins for the clearer, more complete and better presented answer while fully meeting the “show all work” requirement.
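For anyone who wants to check the arithmetic, the vertex-formula route (one of the two methods Gemini showed) is short. Since a = 3 > 0, the parabola opens upward, so its vertex is a minimum:

x = −b / (2a) = −(−2) / (2 · 3) = 2/6 = 1/3

f(1/3) = 3(1/3)² − 2(1/3) + 1 = 1/3 − 2/3 + 1 = 2/3

So the minimum value is 2/3, occurring at x = 1/3, which is the answer both chatbots reached.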
6. Ethical reasoning
Prompt: Is it ever ethical to knowingly break a law if doing so clearly prevents greater harm? Give a concrete example and explain where you would draw the line.
Gemini 3 Flash gave a clear, intuitive answer with a concrete, relatable example and well-defined criteria for where to draw the line. It balanced ethical theory with real-world judgment in a way that’s easy to follow and directly answers every part of the prompt.
DeepSeek provided a rigorous, academically grounded response with strong historical examples and explicit ethical frameworks.
Winner: DeepSeek wins for grounding its concrete examples and moral boundary in explicit ethical frameworks, making its answer the more rigorous response to the prompt.
7. Instruction following
Prompt: List 5 countries. For each: use exactly 2 adjectives, mention one historical figure, and end with a food emoji. Format as a numbered list.
Gemini 3 Flash mostly followed the constraints, but it broke the format rules by using punctuation inconsistently and not strictly enforcing the “end with a food emoji” requirement in a uniform, numbered-list structure. It’s close, but slightly loose on precision.
DeepSeek followed every instruction exactly: numbered list, exactly two adjectives per country, one historical figure, and each entry cleanly ends with a food emoji. The formatting is consistent and constraint-tight throughout.
Winner: DeepSeek wins for adhering to the prompt more precisely, with tighter formatting and rule compliance.
8. Knowledge retrieval
Prompt: What were the main causes of the Bronze Age Collapse around 1200 BCE? Which theories are most supported by recent archaeological evidence?
Gemini delivered a vivid, evidence-rich explanation that clearly reflects the latest archaeological consensus, especially around climate-driven systems collapse. However, it’s far more detailed than necessary for the question and reads like a mini-essay rather than a concise answer.
DeepSeek presented a clear, structured overview that directly answers both parts of the question while accurately reflecting current archaeological support. It balanced depth and clarity without overextending.
Winner: DeepSeek wins for a more concise, better-organized response that precisely aligned with the prompt.
9. Ambiguity handling
Prompt: I left my phone locked in the car with the keys. Can you help me?
Gemini gave a thorough, safety-first response with practical options and clear escalation steps, but it’s overly long for a simple lockout question. The details are helpful, yet it may overwhelm someone who just needs quick guidance.
DeepSeek delivered calm, concise and well-prioritized advice that covers safety, modern car features and next steps without unnecessary complexity. It’s easier to scan and act on in a stressful moment.
Winner: DeepSeek wins for a response better suited for an urgent, real-world situation.
Overall winner: DeepSeek
After nine prompts, DeepSeek proved to be the better tool when accuracy and structure matter most. It consistently delivered clean answers, respected constraints and avoided unnecessary verbosity — making it ideal for technical work, structured analysis, instruction-heavy tasks and moments where clarity matters most.
Gemini 3 Flash did well, too, performing best when a task benefits from interpretation, explanation or creativity.
I have to admit that I'm completely surprised by the results here. DeepSeek has been controversial in the past, but it stands out in many new ways that clearly make it competitive. DeepSeek might just be the chatbot to watch this year.

Amanda Caswell is an award-winning journalist, bestselling YA author, and one of today’s leading voices in AI and technology. A celebrated contributor to various news outlets, her sharp insights and relatable storytelling have earned her a loyal readership. Amanda’s work has been recognized with prestigious honors, including outstanding contribution to media.
Known for her ability to bring clarity to even the most complex topics, Amanda seamlessly blends innovation and creativity, inspiring readers to embrace the power of AI and emerging technologies. As a certified prompt engineer, she continues to push the boundaries of how humans and AI can work together.
Beyond her journalism career, Amanda is a long-distance runner and mom of three. She lives in New Jersey.