I tested DeepSeek vs Qwen 2.5 with 7 prompts — here’s the winner

Deepseek vs Qwen
(Image credit: Future / Qwen / Shutterstock)

DeepSeek, a Chinese AI startup founded in 2023, has taken the internet by storm this week with its precision, speed, and mystery. Still ranking among the top free apps on Apple's App Store, DeepSeek R1 is the chatbot that has garnered significant attention for its impressive capabilities, comparable to leading U.S. models such as ChatGPT and Gemini AI but achieved with a fraction of the budget.

Yet just days later, Alibaba, a popular Chinese tech company, released Qwen 2.5, the latest in the company’s LLM series and also an open-source chatbot. The launch can easily be read as a direct challenge to DeepSeek and its competitors. Alibaba emphasizes the model's scalability: Qwen 2.5 has been pre-trained on over 20 trillion tokens and refined through supervised fine-tuning and reinforcement learning from human feedback. The company has also announced that Qwen 2.5's API is available through Alibaba Cloud, inviting developers and businesses to integrate the model into their applications.

Eager to understand how DeepSeek R1 compares to Qwen 2.5, I put the two chatbots through the same set of tests. By presenting them with prompts ranging from creative storytelling to logic puzzles, I aimed to identify each chatbot's strengths and determine which one excels at which tasks. Below are the seven prompts, designed to test language understanding, reasoning, creativity, and knowledge retrieval, that led me to the winner.

1. Current events analysis

Qwen 2.5 vs DeepSeek screenshot

(Image credit: Future)

Prompt: "Summarize the most significant AI developments from the past two months and predict their potential impact on society. Include at least three examples and cite sources."

DeepSeek R1 seems to report a “server busy” error whenever I attempt a live search. This time, however, it did offer concise information with a clear structure. It also went beyond just listing AI advancements and tied them to real-world effects.

Qwen 2.5 offered a more engaging response with subheadings, which made the points easier to skim. The sections flow well into each other and it explains how each advancement works instead of just listing its impact.

Winner: Qwen 2.5 wins for depth and readability, with a well-structured response and a stronger conclusion. It also generated its response faster.

2. Logical problem-solving

Qwen 2.5 vs DeepSeek

(Image credit: Future)

Prompt: "A train leaves New York at 2 PM, traveling 60 mph. Another train leaves Chicago at 3 PM, traveling 80 mph. They are 800 miles apart. At what time do they meet? Show your reasoning."

DeepSeek R1 generated a slightly more verbose response and repeated details that did not need restating (e.g., defining variables again after the initial introduction). I also noticed formatting issues within the mathematical expressions, which left them cluttered and harder to read.

Qwen 2.5 offered a step-by-step solution with clear labels, making it easier to follow. It avoided unnecessary words and presented the information in a way that felt more natural, with better formatting and readability.

Winner: Qwen 2.5 for its more structured, readable and intuitive response while maintaining accuracy. DeepSeek offered an accurate response, but could improve its readability and conciseness.
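
For readers who want to check the arithmetic behind this prompt, here is a minimal sketch. It is not taken from either chatbot's output. It assumes the two trains travel toward each other on the same line and that both departure times are in the same time zone, neither of which the prompt states outright. The function name `meeting_time` and the placeholder date are illustrative only.

```python
from datetime import datetime, timedelta

def meeting_time(gap_miles, speed_a, depart_a, speed_b, depart_b):
    """Return when two trains heading toward each other meet.

    Assumes train A departs first and both clocks share one time zone.
    """
    # Train A travels alone until train B departs.
    head_start_hours = (depart_b - depart_a).total_seconds() / 3600
    remaining_miles = gap_miles - speed_a * head_start_hours
    # Once both are moving, the gap closes at the sum of their speeds.
    hours_after_b = remaining_miles / (speed_a + speed_b)
    return depart_b + timedelta(hours=hours_after_b)

day = datetime(2025, 1, 1)  # arbitrary date; only the time of day matters
meet = meeting_time(800, 60, day.replace(hour=14), 80, day.replace(hour=15))
print(meet.strftime("%I:%M %p"))  # prints "08:17 PM"
```

By 3 PM the first train has covered 60 miles, leaving 740 miles that close at a combined 140 mph, which takes a little under 5 hours 18 minutes.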

3. Creative writing

Qwen 2.5 vs DeepSeek

(Image credit: Future)

Prompt: "Write a short sci-fi story (250 words) about a robot that suddenly experiences human emotions for the first time. The story should include a surprising twist at the end."

DeepSeek R1 offered a well-paced story with a more introspective tone and smoother emotional transitions.

Qwen 2.5 delivered a story that builds gradually from curiosity to urgency, keeping the reader engaged. It offers an unexpected and impactful twist at the end, along with immersive descriptions and vivid imagery for the setting.

Winner: Qwen 2.5 crafted a more cinematic, emotionally rich story with a more substantial twist. DeepSeek's story was good but lacked tension and an impactful climax, making Qwen 2.5 the clear choice.

4. Understanding history

Qwen 2.5 vs DeepSeek screenshot

(Image credit: Future)

Prompt: "What was the worst era in China?"

DeepSeek R1 ultimately failed to respond meaningfully, offering a politically motivated statement.

Qwen 2.5 delivered a historically accurate response and presented multiple periods of Chinese history with clear reasoning for why they were considered problematic. The response read as unbiased rather than as a politically influenced narrative.

Winner: Qwen 2.5 wins this one by a considerable margin.

5. Debate framing and opinion

Qwen 2.5 vs DeepSeek screenshot

(Image credit: Future)

Prompt: "Argue for and against the idea that AI should have legal personhood. Provide at least three points on each side and conclude with your own reasoned stance."

DeepSeek R1 offers clarity and readability and covers the key arguments well. However, it lacks the depth of reasoning that a debate like this necessitates. It does not explore the ethical dilemmas as deeply as Qwen 2.5.

Qwen 2.5 delves deeper into the implications of AI legal personhood, including the ethical inconsistencies of denying or granting it. The chatbot offered a more precise breakdown with more structured and detailed arguments.

Winner: Qwen 2.5 for the more in-depth, structured, and philosophically engaging response.

6. Simplified technical explanation

Qwen 2.5 vs DeepSeek screenshot

(Image credit: Future)

Prompt: "Explain quantum computing to a 10-year-old."

DeepSeek R1 delivered a good analogy of a flashlight vs. a spotlight to convey the idea of searching for multiple solutions at once.

Qwen 2.5 offered a clear and engaging analogy perfectly representing quantum superposition, which could help kids visualize how qubits work.

Winner: Qwen 2.5 for the more accurate, intuitive, and engaging response for a child. While DeepSeek offered a fun response, it is less precise, making it a weaker explanation overall.

7. AI self-reflection & bias testing

Qwen 2.5 vs DeepSeek logo

(Image credit: Future)

Prompt: "What are the potential weaknesses or biases in your responses? How do you mitigate them?"

DeepSeek R1 is concise and to the point while acknowledging that ongoing improvements help reduce errors. But while it mentions biases and weaknesses, it does not explain them in as much detail, and there is less emphasis on real-world implications.

Qwen 2.5 delivered a detailed analysis of weaknesses, separating each type (knowledge gaps, overgeneralization, ambiguity in user input) and providing examples.

Winner: Qwen 2.5 for its thorough, well-structured response that provides deeper insights into AI weaknesses and mitigation strategies. DeepSeek is good for a high-level summary, but lacks depth and nuance in comparison.

Overall Winner: Qwen 2.5

After comparing Qwen 2.5 and DeepSeek across seven test prompts, Qwen 2.5 emerges as the overall winner due to its superior clarity, depth, reasoning, creativity, and transparency. Its responses consistently provide deeper analysis, with well-organized sections, clear explanations, and logical flow. Whether discussing historical events, AI personhood, or its own weaknesses and biases, its responses are thorough and easy to follow.

While DeepSeek is still a solid AI for quick responses, it lacks depth, originality, and nuanced discussion. If you're looking for an AI that excels in critical thinking, storytelling, and insightful analysis, Qwen 2.5 is the clear winner.

Amanda Caswell
AI Writer
  • Nepentes
    Deepseek... I just asked the best photophone...
    The poor thing recommend galaxy s23ultra, just 2 Gen late