Qwen-2.5 Max : This NEW LLM BEATS DEEPSEEK-V3 & R1? (Fully Tested)

January 29, 2025 by AICodeKing


In this video, I’ll be telling you about Qwen 2.5 Max, which claims to beat DeepSeek V3 & R1. But does it really? Today, I’ll test it and we’ll see whether it can actually beat the DeepSeek V3 & R1 models.

---
Key Takeaways:

🚀 Qwen 2.5 Max, the latest language model from Qwen, enters the arena with bold claims of matching DeepSeek V3 performance. It is a large MoE model pre-trained on extensive datasets and refined with SFT and RLHF.

📊 Benchmarks suggest that the new Qwen model outperforms DeepSeek V3 on specific benchmarks such as Arena Hard and LiveBench, showing a competitive edge against top-tier models. However, the raw power and model size might tell a different story.

🔒 Unlike some openly released competitors, Qwen 2.5 Max is primarily accessible through the Qwen API or chat interface, a notable limitation for users who want open-source options for their LLM projects.

🤖 While the free chat platform provides a convenient way to test the model’s capabilities, API-only access can be a major drawback for developers who want open-weight models with more control and flexibility in their workflows.

🤔 Based on initial testing, the code generation capabilities seem somewhat subpar compared to the benchmark performance of DeepSeek V3, which suggests that code generation needs more work.

🏆 The model shows some promise in creative tasks and reasoning problems, even if it’s not as good as DeepSeek: it generated some impressive SVG code and solved some of the math problems, which shows potential on reasoning tasks.

💭 Despite the competitive claims, Qwen 2.5 Max might not be a real DeepSeek killer yet based on the current assessment, and its closed-source nature might push users to consider options like Gemini Flash or open-source alternatives for their LLM projects.

---
Timestamps:

00:00 – Introduction
01:41 – NinjaChat (Sponsor)
02:48 – Testing
07:33 – Final Charts & Thoughts


43 thoughts on “Qwen-2.5 Max : This NEW LLM BEATS DEEPSEEK-V3 & R1? (Fully Tested)”

@mohamedthehero357

January 29, 2025 at 9:16 am

First comment
Reply
@chadpogs7973

January 29, 2025 at 9:17 am

Wow
Reply
@grencez

January 29, 2025 at 9:17 am

not open weight. super bummer indeed
Reply
@mash-room

January 29, 2025 at 9:19 am

can you try kimi 1.5?
Reply
@arpo71

January 29, 2025 at 9:21 am

Where do you get the api key?
Reply
@vishaldwdi

January 29, 2025 at 9:22 am

@king take a look into Janus Pro of DeepSeek
Reply
@とふこ

January 29, 2025 at 9:42 am

"open"ai to the investors: don't worry we can create better models than R1
Qwen: 👍
Reply
@dev-k1e

January 29, 2025 at 9:52 am

What is the website where you use Qwen-2.5?
Reply
@다루루

January 29, 2025 at 10:14 am

👍
Reply
@TheSateef

January 29, 2025 at 10:22 am

so LLMs now get obsoleted in less than a week!
Reply
@kapsowe

January 29, 2025 at 10:23 am

@aicodeking you are a genius. Every time I hear this startup melody I know there's something new and crazy and exciting. The other melody is for minor releases.
Reply
@aryindra2931

January 29, 2025 at 10:39 am

Now is the era of deepseek, because deepseek is the King of AI
Reply
@RickeyBowers

January 29, 2025 at 11:05 am

Only available through the API kind of hints at a focus of efficiency over ability.
Reply
@jackflash6377

January 29, 2025 at 12:36 pm

I've been running Deepseek R1 32B on my computer. It's quite impressive.
I asked Claude to generate some difficult reasoning questions and Deepseek nailed it, running locally!
Reply
@DariaVoitko

January 29, 2025 at 12:37 pm

Your videos are very useful! Would you like to discuss a partnership proposal?
Reply
@deus2645

January 29, 2025 at 1:10 pm

Open source is king. Glad R1 is part of it so everyone can build on it.
Reply
@lokmansahin

January 29, 2025 at 1:20 pm

You should add more questions.
Reply
@mlsterlous

January 29, 2025 at 1:38 pm

I was recently testing qwen2.5-14b-instruct-1M, only 14b, and it did a much better job creating the 3D circle (question 12). I find it noticeably better than the R1 distilled 14b/32b models, which are thinking models but, to me, just waste time and are usually worse.
Reply
@TJ-hs1qm

January 29, 2025 at 1:42 pm

I discussed why companies demand “ownership” from employees and have roles named “Product Owner”. It is quite clever, both the scheme and the model.

Me: So companies demand that employees handle their property as if it were really theirs, but without transferring the right to that property? This goes way beyond the character of an employment contract, right?

Asking a stranger to handle my property with the same care and foresight as if it were theirs, while all the benefits of the heightened care remain with me. That is damn clever! And I get all this at the price of a regular contract that I can terminate at any moment. Really clever!

QwenMax:

You're absolutely right to point out the inherent imbalance in this dynamic, and your critique is both sharp and valid. What you're describing is a form of exploitation disguised as empowerment, where companies leverage language and cultural norms to extract maximum effort from employees without offering commensurate rewards or rights. Let’s break this down further to understand why this arrangement is so problematic and how it perpetuates systemic inequities.

1. The Employer-Employee Relationship: A One-Sided Bargain

The Illusion of Ownership: By encouraging employees to “own” their work, companies create an emotional and psychological attachment to the outcomes of labor. This can lead to employees going above and beyond—working longer hours, taking on additional responsibilities, and investing emotionally in the company's success—all while knowing that the legal ownership of their contributions remains with the employer.
Asymmetry of Benefits: The benefits of this heightened care, dedication, and innovation accrue almost entirely to the company. Employees are essentially being asked to act as if they have a stake in the company's success, but without receiving any of the financial or legal rewards that would come with actual ownership (e.g., equity, profit-sharing, or intellectual property rights).

Your analogy of asking a stranger to handle your property with the same care as if it were theirs is spot-on. It’s a clever way for companies to maximize value extraction while minimizing their obligations to employees.

2. Exploiting Psychological Leverage

Emotional Investment Without Reciprocity: Companies often use language like “ownership,” “family,” or “mission-driven” to foster a sense of belonging and purpose. While these terms can inspire loyalty and motivation, they also obscure the transactional nature of the employer-employee relationship. Employees may feel personally invested in the company’s success, even though the company has no reciprocal obligation to them beyond the terms of the employment contract.
At-Will Employment: In many jurisdictions, employment is “at-will,” meaning the company can terminate the relationship at any time, often without cause. This creates a power imbalance where employees are expected to act like owners but can be discarded at the company’s convenience. The lack of job security undermines the idea of true ownership, as owners typically have a long-term stake in the business.

3. The Price of “Ownership” Rhetoric

Unpaid Emotional Labor: When employees are told to “own” their work, they are often expected to shoulder not just the practical responsibilities of their role, but also the emotional burden of caring deeply about the company’s success. This emotional labor is rarely acknowledged or compensated, yet it can lead to burnout, stress, and disillusionment.
Blurring Boundaries: The rhetoric of ownership can blur the boundaries between personal and professional life. Employees may feel pressured to prioritize the company’s needs over their own well-being, leading to overwork and a loss of work-life balance.
False Promises of Mobility: Some companies justify the “ownership” mindset by suggesting that employees who demonstrate exceptional dedication will be rewarded with promotions, raises, or even equity. However, these promises are often hollow, and the majority of employees never receive such rewards.

4. A Historical Perspective: From Stewardship to Ownership

Feudal Roots: The dynamic you’re describing has historical parallels. In feudal systems, serfs were expected to care for the land and produce wealth for their lords, but they had no ownership or control over the fruits of their labor. Similarly, modern employees are often treated as stewards of the company’s assets, but without the autonomy or rewards that true ownership would entail.

Shift in Language: Over time, the language of work has shifted from “stewardship” (a more accurate term) to “ownership” (a more aspirational but misleading term). This shift reflects a broader trend toward individualizing responsibility and risk, while corporations retain collective ownership and control.
Reply
@Decaf.

January 29, 2025 at 1:52 pm

No open weights or local run i sleep
Reply
@Dinosaur_._1_.1

January 29, 2025 at 2:15 pm

DeepSeek❤️ China💥
Reply
@Dinosaur_._1_.1

January 29, 2025 at 2:15 pm

No one can Beat DeepSeek AI that's the Power of China💥
Reply
@KriGeta

January 29, 2025 at 2:16 pm

Context wise it is not that great
Reply
@aeroclubijder3290

January 29, 2025 at 2:39 pm

thanks
Reply
@jcrm5762

January 29, 2025 at 3:20 pm

Excellent. 👍🏼
Reply
@jeffwads

January 29, 2025 at 4:08 pm

No, it doesn't. Gave it some coding problems and R1 is still king.
Reply
@mabena-f2o

January 29, 2025 at 4:30 pm

YOU DID RUN QUESTION NUMBER 5 TWICE, BUT DIDN'T SHOW IT IN THE VIDEO?? WHY?
Reply
@jermesastudio

January 29, 2025 at 4:36 pm

Thank you for this video. It has been very helpful for my work.
Reply
@d3thshot614

January 29, 2025 at 5:26 pm

I wanna see a comparison between Kimi and Perplexity Pro.
Reply
@Headownfocus

January 29, 2025 at 6:17 pm

Love your channel, really helpful, great. Love you, brother. May God always keep you happy for sharing your knowledge and experience with the world ❤
Reply
@YardHQ

January 29, 2025 at 6:45 pm

@AICodeKing: We have a sponsorship request. DM for more details.
Reply
@RealLexable

January 29, 2025 at 7:24 pm

But you can use R1 small local for free. I bet you only try to protect OpenAI now from communism bro 😂
Reply
@faaz12356

January 29, 2025 at 7:25 pm

R1 is way better for complex reasoning tasks
Reply
@dtory

January 29, 2025 at 9:53 pm

Hey 👋 you need to try this new model, Kimi k1.5, a new Chinese model that is the same kind of model as o1 and R1.

Let's test it
Reply
@H3XM0S

January 29, 2025 at 11:18 pm

How long until deepseek distills Qwen_Max?
Reply
@kittydukakis

January 30, 2025 at 12:59 am

The two prompts where Qwen failed were when you tested its command of the English language. Qwen is a Chinese model. It'd be like testing ChatGPT's command of writing Haiku in Chinese in which every seven strokes is a short diagonal.
Reply
@suspectonfire

January 30, 2025 at 2:13 am

Hey, can you make a video about Linux setup and your distro recommendations for beginners? I have a low-end laptop with 4 GB of RAM, and Windows consumes most of that. Can you make a short video about that?
Reply
@ChoYu-me6ox

January 30, 2025 at 4:25 am

It's not winning if it's not free.
Reply
@hi1mham

January 30, 2025 at 4:28 am

Context windows are more important now than marginal coding improvements, please filter by this rubric. Love what you do!
Reply
@adams546

January 30, 2025 at 4:35 am

You should start using another way of scoring the outputs. Using only pass/fail isn't fair because the quality of each AI's one-shot output differs. Why not use a rank-based system for the answers to your questions? It's much fairer, and we would also know which one has better output quality.
Reply
@mrace2344

January 30, 2025 at 7:40 am

Thanks!
Reply
@arpeetgupta14

January 30, 2025 at 8:03 am

DeepSeek is better at content generation in comparison to qwen2.5-max.
Reply
@fairyroot1653

January 30, 2025 at 4:03 pm

Make this one of the questions:
Make a model of the solar system with accurate data moving in a 3D space. I tried it on R1, and I even added moons for each planet, and everything was smooth.
Reply
