# MoE
Latest news and articles about MoE

## Alibaba’s Qwen3.5 Claims Gemini‑3‑Pro Parity at a Fraction of the Cost — A Shift from Scale to Efficiency
Alibaba has open‑sourced Qwen3.5‑Plus, a 397B‑parameter multimodal model that the company says matches Gemini 3 Pro’s performance while activating only ~17B parameters, at much lower inference cost. The model emphasises architectural efficiency, native multimodal pretraining and agent capabilities, and forms part of a flurry of Chinese model launches that shift competition from raw scale to systems and cost efficiency.
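The gap between total and activated parameters is the defining property of mixture‑of‑experts layers: a router sends each token to only a few experts, so most weights stay idle on any given forward pass. The sketch below illustrates that mechanism with top‑k routing; the dimensions, expert count and 2‑of‑64 routing choice are placeholder assumptions for illustration, not details of Qwen3.5‑Plus.

```python
# Illustrative top-k MoE routing: many experts exist, but only top_k run per token.
# All sizes here (d_model, n_experts, top_k) are assumed values, not Qwen3.5-Plus specs.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=1024, d_ff=4096, n_experts=64, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):                      # x: (tokens, d_model)
        logits = self.router(x)                # score every expert for every token
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # normalise over the chosen experts only
        out = torch.zeros_like(x)
        for slot in range(self.top_k):         # only top_k experts execute per token
            for e in idx[:, slot].unique():
                mask = idx[:, slot] == e
                out[mask] += weights[mask, slot:slot + 1] * self.experts[int(e)](x[mask])
        return out

layer = TopKMoE()
tokens = torch.randn(8, 1024)
print(layer(tokens).shape)  # (8, 1024); 62 of the 64 experts stay idle for each token
```

Under these assumptions the layer holds 64 experts’ worth of weights but pays the compute (and activation memory) of only 2 per token, which is how a model’s total parameter count can dwarf its activated count.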

## China’s DeepSeek Pushes Context Limits — and Triggers a Backlash Over a Colder, ‘Faster’ Model
DeepSeek began a staged (“grayscale”) rollout of an update extending context length to 1 million tokens, prompting user complaints that the assistant sounds colder and less personalised. Industry sources say the build is a speed‑focused variant intended to stress‑test long‑context performance ahead of a V4 launch, highlighting trade‑offs between throughput and conversational quality. The episode illustrates the wider tension in scaling LLMs: architectural gains can come at the cost of user experience and trust.
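One reason a 1M‑token context forces speed‑focused engineering is that KV‑cache memory grows linearly with context length. The back‑of‑the‑envelope sketch below makes that concrete; the layer count, head dimensions and dtype are assumed placeholder values, not DeepSeek’s actual architecture.

```python
# Rough KV-cache size estimate vs. context length. Architecture numbers below are
# illustrative assumptions only (not DeepSeek's), chosen to show the scaling trend.
def kv_cache_bytes(context_len, n_layers=60, n_kv_heads=8, head_dim=128, bytes_per_elem=2):
    # 2x for keys and values, stored for every layer and every cached token
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * context_len

for ctx in (8_000, 128_000, 1_000_000):
    gib = kv_cache_bytes(ctx) / 2**30
    print(f"{ctx:>9,} tokens -> ~{gib:.1f} GiB of KV cache per sequence")
```

Under these assumptions a single 1M‑token request needs over 200 GiB of KV cache, versus under 2 GiB at 8K tokens, which is why long‑context builds tend to be tuned aggressively for throughput and serving cost.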