Alibaba’s Qwen 2.5 is top open-source model in math and coding

February 15, 2025

75

Alibaba launched greater than 100 open-source AI fashions together with Qwen 2.5 72B which beats different open-source fashions in math and coding benchmarks.

A lot of the AI business’s consideration in open-source fashions has been on Meta’s efforts with Llama 3, however Alibaba’s Qwen 2.5 has closed the hole considerably. The freshly launched Qwen 2.5 household of fashions vary in dimension from 0.5 to 72 billion parameters with generalized base fashions in addition to fashions centered on very particular duties.

Alibaba says these fashions include “enhanced data and stronger capabilities in math and coding” with specialised fashions centered on coding, maths, and a number of modalities together with language, audio, and imaginative and prescient.

Alibaba Cloud additionally introduced an improve to its proprietary flagship mannequin Qwen-Max, which it has not launched as open-source. The Qwen 2.5 Max benchmarks look good, nevertheless it’s the Qwen 2.5 72B mannequin that has been producing many of the pleasure amongst open-source followers.

Qwen 2.5 72B instruct mannequin math and coding benchmarks. Supply: Alibaba Cloud

The benchmarks present Qwen 2.5 72B beating Meta’s a lot bigger flagship Llama 3.1 405B mannequin on a number of fronts, particularly in math and coding. The hole between open-source fashions and proprietary ones like these from OpenAI and Google can be closing quick.

Early customers of Qwen 2.5 72B present the mannequin coming simply wanting Sonnet 3.5 and even beating OpenAI’s o1 fashions in coding.

Open supply Qwen 2.5 beats o1 fashions on coding 🤯🤯

Qwen 2.5 scores increased than the o1 fashions on coding on Livebench AI

Qwen is slightly below Sonnet 3.5, and for an open-source mode, that’s superior!!

o1 is nice at some exhausting coding however horrible at code completion issues and… pic.twitter.com/iazam61eP9

— Bindu Reddy (@bindureddy) September 20, 2024

Alibaba says these new fashions had been all skilled on its large-scale dataset encompassing as much as 18 trillion tokens. The Qwen 2.5 fashions include a context window of as much as 128k and might generate outputs of as much as 8k tokens.

The transfer to smaller, extra succesful, and open-source free fashions will doubtless have a wider impression on plenty of customers than extra superior fashions like o1. The sting and on-device capabilities of those fashions imply you will get plenty of mileage from a free mannequin operating in your laptop computer.

The smaller Qwen 2.5 mannequin delivers GPT-4 degree coding for a fraction of the price, and even free when you’ve received an honest laptop computer to run it domestically.

We have now GPT-4 for coding at residence! I appeared up OpenAI?ref_src=twsrcpercent5Etfw”>@OpenAI GPT-4 0613 outcomes for numerous benchmarks and in contrast them with @Alibaba_Qwen 2.5 7B coder. 👀

> 15 months after the discharge of GPT-0613, we’ve an open LLM underneath Apache 2.0, which performs simply as properly. 🤯

> GPT-4 pricing… pic.twitter.com/2szw5kwTe5

— Philipp Schmid (@_philschmid) September 22, 2024

Along with the LLMs, Alibaba launched a big replace to its imaginative and prescient language mannequin with the introduction of Qwen2-VL. Qwen2-VL can comprehend movies lasting over 20 minutes and helps video-based question-answering.

It’s designed for integration into cellphones, vehicles, and robots to allow automation of operations that require visible understanding.

Alibaba additionally unveiled a brand new text-to-video mannequin as a part of its picture generator, Tongyi Wanxiang massive mannequin household. Tongyi Wanxiang AI Video can produce cinematic high quality video content material and 3D animation with numerous creative types based mostly on textual content prompts.

The demos look spectacular and the software is free to make use of, though you’ll want a Chinese language cell quantity to enroll in it right here. Sora goes to have some critical competitors when, or if, OpenAI finally releases it.

Tags
Industry

Buy now

Alibaba’s Qwen 2.5 is top open-source model in math and coding

Related Articles

China’s open AI models are in a dead heat with the...

I Tried GPT 5.2 and This is How It Went..

Undetectable AI vs. Scribbr: Which One Detects AI Writing More Accurately?

Leave a Reply Cancel reply

Latest Articles

China’s open AI models are in a dead heat with the...

I Tried GPT 5.2 and This is How It Went..

Undetectable AI vs. Scribbr: Which One Detects AI Writing More Accurately?

AWS re:Invent was an all-in pitch for AI. Customers might not...

Bone AI raises $12M to challenge Asia’s defense giants with AI-powered...