# large language model

Latest news and articles about large language models

Total: 6 articles found

ByteDance Elevates 'Doubao' to 2.0 — A Productised, Multi‑variant Large Model Push with Bold Benchmark Claims

ByteDance has launched Doubao 2.0, a family of production‑oriented large models (Pro, Lite, Mini, and a Code variant) focused on efficient inference, multimodal ability and long‑chain task execution. The company claims strong benchmark performance versus rivals but faces scrutiny over IP and international review requests, highlighting the tension between rapid commercialisation and the need for independent validation and regulatory compliance.

NeTe, 14 February 2026, 07:54

#ByteDance #Doubao 2.0 #large language model

Technology

Chinese AI Lab DeepSeek Trials 1‑Million‑Token Context Window in App — API Still Capped at 128K

DeepSeek is testing a new long‑context model in its web and app interfaces that supports roughly one million tokens, while its public API, on version 3.2, remains limited to a 128K‑token context. The trial highlights the commercial and technical trade‑offs involved in bringing ultra‑long context windows to production and signals intensifying competition in China’s AI landscape.

NeTe, 13 February 2026, 13:04

#DeepSeek #long context #1M tokens

Technology

China’s DeepSeek Pushes Context Limits — and Triggers a Backlash Over a Colder, ‘Faster’ Model

DeepSeek began a staged (gray‑release) rollout of an update extending context length to 1 million tokens, prompting user complaints that the assistant sounds colder and less personalised. Industry sources say the build is a speed‑focused variant intended to stress‑test long‑context performance ahead of a V4 launch, highlighting trade‑offs between throughput and conversational quality. The episode illustrates a wider tension in scaling LLMs: architectural gains can come at the cost of user experience and trust.

NeTe, 12 February 2026, 17:04

#DeepSeek #large language model #long context

Technology

Alibaba’s Qwen3‑Max‑Thinking: China’s Latest Push to Match Western ‘Thinking’ Models

Alibaba has launched Qwen3‑Max‑Thinking, a flagship reasoning model its team says matches top Western 'thinking' models on key benchmarks. The model emphasises reasoning, instruction following and agent capabilities and is aimed at commercial integration across Alibaba's cloud and services. The announcement underscores China’s accelerating push to develop indigenous, production‑ready large language models, though the benchmark claims still require independent validation.

NeTe, 26 January 2026, 16:10

#Alibaba #Qwen3‑Max‑Thinking #large language model

Technology

Alibaba Unveils Qwen3‑Max‑Thinking, a Trillion‑Parameter Inference Model Aimed at Beating Western Rivals

Alibaba has released Qwen3‑Max‑Thinking, a trillion‑parameter inference model it says surpasses leading Western models on multiple benchmarks, with stronger agent tool‑calling and fewer hallucinations. The company is opening trials on PC and web, positioning the model for broad commercial use, while its claims have yet to be independently verified.

NeTe, 26 January 2026, 16:10

#Alibaba #Qwen3‑Max‑Thinking #large language model

Health

JD Health Unveils 'Zhuoyi' 2.0 — A Push to Embed Large‑Scale AI Inside Chinese Hospitals

JD Health unveiled Zhuoyi 2.0, a hospital‑oriented large‑model product designed for deployment across all clinical scenarios. The launch highlights Chinese tech firms’ drive to embed generative AI into hospital workflows while raising questions about clinical validation, regulation and data governance.

NeTe, 20 January 2026, 11:30

#JD Health #Zhuoyi #medical AI