Chinese tech company Alibaba on Monday released Qwen 3, a family of AI models that the company claims matches and, in some cases, outperforms the best models available from Google and OpenAI.
Most of the models are, or soon will be, available for download under an “open” license from AI dev platform Hugging Face and GitHub. They range in size from 0.6 billion parameters to 235 billion parameters. Parameters roughly correspond to a model’s problem-solving skills, and models with more parameters generally perform better than those with fewer parameters.
The rise of China-originated model series like Qwen has increased the pressure on American labs such as OpenAI to deliver more capable AI technologies. It has also led policymakers to implement restrictions aimed at limiting the ability of Chinese AI companies to obtain the chips necessary to train models.
Introducing Qwen3!
We release and open-weight Qwen3, our latest large language models, including 2 MoE models and 6 dense models, ranging from 0.6B to 235B. Our flagship model, Qwen3-235B-A22B, achieves competitive results in benchmark evaluations of coding, math, general… pic.twitter.com/JWZkJeHWhC
— Qwen (@Alibaba_Qwen) April 28, 2025
According to Alibaba, Qwen 3 models are “hybrid” models in the sense that they can take time and “reason” through complex problems or answer simpler requests quickly. Reasoning enables the models to effectively fact-check themselves, similar to models like OpenAI’s o3, but at the cost of higher latency.
“We have seamlessly integrated thinking and non-thinking modes, offering users the flexibility to control the thinking budget,” wrote the Qwen team in a blog post. “This design enables users to configure task-specific budgets with greater ease.”
Some of the models also adopt a mixture of experts (MoE) architecture, which can be more computationally efficient for answering queries. MoE breaks down tasks into subtasks and delegates them to smaller, specialized “expert” models.
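To make the delegation idea concrete, here is a toy sketch of top-k gated routing, one common way MoE layers are built. It is not Qwen 3's implementation: the names (make_expert, moe_forward, gate_weights), the expert count, and the "experts" themselves (simple scaling functions standing in for sub-networks) are all assumptions chosen for illustration.

```python
import math
import random

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def make_expert(scale):
    """Hypothetical 'expert': a trivial function standing in for a sub-network."""
    return lambda x: [scale * v for v in x]

def moe_forward(x, gate_weights, experts, top_k=2):
    """Route input x to the top_k highest-scoring experts and mix their outputs."""
    # Gate score per expert: dot product of the input with that expert's gate vector.
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    # Keep only the top_k experts; the others are never evaluated.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    weights = softmax([scores[i] for i in chosen])
    output = [0.0] * len(x)
    for w, i in zip(weights, chosen):
        expert_out = experts[i](x)
        output = [o + w * e for o, e in zip(output, expert_out)]
    return output, chosen

random.seed(0)  # fixed seed so the illustrative gate weights are reproducible
num_experts, dim = 8, 4
experts = [make_expert(scale=i + 1) for i in range(num_experts)]
gate_weights = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(num_experts)]
token = [0.5, -0.2, 0.1, 0.9]
output, chosen = moe_forward(token, gate_weights, experts, top_k=2)
print("experts used:", chosen)
print("output:", output)
```

Only two of the eight toy experts run for this input, which is where the potential compute savings come from: the total parameter count can be large while the work done per query stays small.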
The Qwen 3 models support 119 languages, Alibaba says, and were trained on a data set of nearly 36 trillion tokens. Tokens are the raw bits of data that a model processes; 1 million tokens is equivalent to about 750,000 words. Alibaba says that Qwen 3 was trained on a combination of textbooks, “question-answer pairs,” code snippets, AI-generated data, and more.
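For a sense of scale, the token-to-word ratio above can be applied to the training set size. This is a back-of-the-envelope sketch: it assumes the rough ratio holds across the whole data set and treats "nearly 36 trillion" as a full 36 trillion, so the result is an upper-end estimate. The function name is hypothetical.

```python
# Ratio given in the article: 1 million tokens is about 750,000 words.
WORDS_PER_TOKEN = 750_000 / 1_000_000

def tokens_to_words(tokens):
    """Estimate a word count from a token count using the article's rough ratio."""
    return tokens * WORDS_PER_TOKEN

training_tokens = 36e12  # "nearly 36 trillion tokens", rounded up for the estimate
estimated_words = tokens_to_words(training_tokens)
print(f"about {estimated_words / 1e12:.0f} trillion words")  # prints "about 27 trillion words"
```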
These improvements, along with others, greatly boosted Qwen 3’s capabilities compared to its predecessor, Qwen 2, says Alibaba. None of the Qwen 3 models are head and shoulders above top-of-the-line recent models like OpenAI’s o3 and o4-mini, but they’re strong performers nonetheless.
On Codeforces, a platform for programming contests, the largest Qwen 3 model, Qwen3-235B-A22B, narrowly beats OpenAI’s o3-mini and Google’s Gemini 2.5 Pro. Qwen3-235B-A22B also bests o3-mini on the latest version of AIME, a challenging math benchmark, and BFCL, a test for assessing a model’s ability to “reason” about problems.
But Qwen3-235B-A22B isn’t publicly available, at least not yet.
The largest public Qwen 3 model, Qwen3-32B, is still competitive with a number of proprietary and open AI models, including Chinese AI lab DeepSeek’s R1. Qwen3-32B surpasses OpenAI’s o1 model on several tests, including the coding benchmark LiveCodeBench.
Alibaba says Qwen 3 “excels” in tool-calling capabilities as well as following instructions and copying specific data formats. In addition to the models for download, Qwen 3 is available from cloud providers including Fireworks AI and Hyperbolic.
Tuhin Srivastava, co-founder and CEO of AI cloud host Baseten, said that Qwen 3 is another point in the trend line of open models keeping pace with closed-source systems such as OpenAI’s.
“The U.S. is doubling down on restricting sales of chips to China and purchases from China, but models like Qwen 3 that are state-of-the-art and open […] will undoubtedly be used domestically,” he told TechCrunch. “It reflects the reality that businesses are both building their own tools [as well as] buying off the shelf via closed-model companies like Anthropic and OpenAI.”