A Night of Acceleration: Zhipu’s GLM‑5 Debuts as MiniMax and DeepSeek Race to Keep Up

Three leading Chinese AI firms unveiled near‑simultaneous upgrades that signal a shift from demo‑level coding assistants to production‑oriented, agentic systems. Zhipu launched GLM‑5 as an open‑source foundation for long‑horizon engineering tasks, while MiniMax and DeepSeek pushed product and context upgrades aimed at real‑world throughput and extended interactions.

Key Takeaways

  • Zhipu announced and open‑sourced GLM‑5, positioning it as an “Agentic Engineering” foundation rather than a chat model.
  • GLM‑5 reportedly grew to 744B total parameters (40B activated), trained on ~28.5T tokens, and introduced sparse attention and asynchronous reinforcement learning; Zhipu cites top open‑source rankings and strong software‑engineering benchmarks.
  • MiniMax quietly placed an M2.5 model into its Agent product, claiming production‑grade agent and coding performance with only 10B activated parameters and high throughput (100 TPS).
  • DeepSeek extended its context window to around 1 million tokens and refreshed its knowledge cutoff to May 2025, signalling a focus on longer workflows.
  • The cluster of updates highlights a Chinese industry pivot to production, efficiency and long‑horizon agent behaviour, with immediate market and strategic implications.

Editor's Desk

Strategic Analysis

This episode crystallises three strategic trends in the AI race: the movement from single‑turn coding assistants to multi‑step, resource‑aware agents; the premium on efficiency and deployability as much as raw parameter count; and the tactical value of quiet, product‑level rollouts that test adoption before formal announcements. If vendor claims hold up under independent scrutiny, enterprises will prioritise models that can sustain long interactions, call tools reliably and run at acceptable cost on domestic hardware. That raises secondary battles over chip adaptation, software stacks and compliance ecosystems. For policymakers and foreign firms, the immediate implication is that Chinese AI capacity is maturing in ways that complicate neat Western dominance narratives; for customers, it means choosing between incumbent cloud‑centric ecosystems and rapidly advancing, locally integrated alternatives. The competitive frontier will be interoperability, verified safety and who can turn agentic capabilities into dependable, regulated services first.

China Daily Brief Editorial

In a matter of hours on the night of Feb. 11–12, three of China’s best‑known large‑model companies pushed forward new versions or upgrades that together mark a step change in how domestic AI firms are thinking about production. Zhipu (智谱) formally unveiled and open‑sourced GLM‑5, MiniMax quietly surfaced a new M2.5 model inside its agent product, and DeepSeek upgraded core capabilities — a flurry of activity that crystallises a wider industry pivot from prototype demos to continuous, agentic engineering.

Zhipu positions GLM‑5 not as another chat model but as a foundation for “Agentic Engineering”: systems that can run multi‑step procedures, manage resources and deliver production‑grade results rather than single‑turn code snippets. The company said GLM‑5 is the anonymous “Pony Alpha” that had already been used by global developers to build games, agent worlds and full applications during testing, arguing that real‑world, brand‑free usage is evidence of genuine capability.

Technically, Zhipu reports substantial upgrades. GLM‑5’s total parameter count rose to 744 billion, with an activated budget of 40 billion per token, up from the previous generation’s 355B (32B active), and pretraining data increased to about 28.5 trillion tokens. Zhipu also says it integrated DeepSeek’s sparse‑attention mechanism to preserve long‑context performance while lowering deployment cost, and introduced an asynchronous reinforcement‑learning framework called “Slime” to sustain learning over long interactions.
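
For readers unfamiliar with the distinction between total and activated parameters in mixture‑of‑experts models, the short Python sketch below is a back‑of‑envelope illustration, not Zhipu’s code: the parameter counts are simply the vendor‑reported figures quoted above, and the "about 2 FLOPs per active parameter per token" rule of thumb is a generic approximation rather than a published Zhipu number. It shows why a 744B‑parameter model that activates only 40B parameters per token can be far cheaper to serve than a dense model of the same total size.

    # Illustrative comparison of per-token forward-pass compute for a dense
    # model versus a mixture-of-experts (MoE) model. Parameter counts are the
    # vendor-reported figures quoted in the article; the "~2 FLOPs per active
    # parameter per token" estimate is a common approximation, not a
    # Zhipu-published figure.

    def forward_flops_per_token(active_params: float) -> float:
        """Rough forward-pass cost: about 2 FLOPs per active parameter per token."""
        return 2.0 * active_params

    GLM5_TOTAL = 744e9    # reported total parameters
    GLM5_ACTIVE = 40e9    # reported activated parameters per token
    PREV_TOTAL = 355e9    # previous generation, per the article
    PREV_ACTIVE = 32e9

    dense_cost = forward_flops_per_token(GLM5_TOTAL)   # hypothetical dense 744B model
    moe_cost = forward_flops_per_token(GLM5_ACTIVE)    # MoE: only routed experts run

    print(f"Share of parameters activated per token: {GLM5_ACTIVE / GLM5_TOTAL:.1%}")
    print(f"Per-token compute vs. a hypothetical dense 744B model: {moe_cost / dense_cost:.1%}")
    print(f"Total-parameter growth over the 355B predecessor: {GLM5_TOTAL / PREV_TOTAL:.2f}x")
    print(f"Activated-parameter growth over the 32B-active predecessor: {GLM5_ACTIVE / PREV_ACTIVE:.2f}x")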

Benchmark claims are aggressive. Zhipu cites an Artificial Analysis ranking that places GLM‑5 fourth globally and first among open‑source models, and reports strong performance on software‑engineering and terminal tests — including 77.8 on SWE‑bench Verified and 56.2 on Terminal Bench 2.0 — scores it says surpass Google’s Gemini 3 Pro and put the model on par with Claude Opus 4.5. In an evocative test called Vending Bench 2, GLM‑5 reportedly ran a simulated vending‑machine business for a year and finished with a $4,432 balance, a result Zhipu uses to illustrate the model’s planning and resource‑management competence.

MiniMax’s move was less theatrical but equally consequential: users noticed a new “M2.5” model option in its Agent product before any formal announcement. Early testers describe powerful agent and coding capabilities at much lower compute cost. MiniMax markets M2.5 as a production‑grade agent model with just 10 billion activated parameters, high throughput (supporting 100 TPS) and efficiency advantages that make it suitable for cross‑platform office productivity tasks such as Excel, PPT and deep research.

DeepSeek, meanwhile, appears to be taking a different tack by quietly extending practical capabilities. The company’s context window reportedly jumped to 1 million tokens from 128K and its knowledge cutoff was updated to May 2025, signalling a focus on longer workflows and fresher information. The staggered cadence of these announcements suggests the sector is racing on multiple fronts: model scale, efficiency, contextual length and real‑world tool use.

The commercial response was immediate. Chinese AI equities and related chip stocks rallied on the news, and analysts and developers framed the night as evidence that Chinese models are now competing for production work previously thought to belong to Western incumbents. For buyers and builders, the shift matters: vendors that can deliver dependable, efficient agentic systems will have an edge in enterprise adoption, while those that remain at the demo stage risk being leapfrogged.

Yet important questions remain. Benchmarks supplied by vendors can be selective; operational robustness, security, safety guardrails and reproducible third‑party verification will determine which models actually flourish in enterprise and consumer deployments. The broader geopolitical context — export controls on advanced chips, international partnerships on data and research, and Western cloud providers’ reactions — will shape how quickly these models influence markets beyond China’s borders.

For now, the takeaway is speed. The simultaneous rollouts underscore how quickly China’s AI ecosystem is moving from proof‑of‑concept to production, and how competition is shifting from pure scale to a blend of efficiency, long‑context reasoning and continuous agentic behaviour. In that race, falling behind can happen in a single night.
