As Models Mature, Chinese AI Firms Put Data — Not Parameters — at the Centre of Industrialisation

At the 2026 AIGC Developers Conference, Hangzhou Cangjie Intelligent’s CEO argued that industrial AI’s bottleneck is data, not models. Companies that build scalable, task‑structured, reusable data systems will gain the competitive moats required to deploy AI in manufacturing, robotics and embodied systems.


Key Takeaways

  • Hangzhou Cangjie Intelligent presented at AIGC 2026, arguing that data — not model size — is the core infrastructure for industrial AI.
  • In sectors such as manufacturing, robotics and autonomous driving, deployment depends on high‑quality, task‑structured, continuously producible data.
  • Cangjie’s strategy focuses on multimodal dataset construction and data engineering to create reusable data assets rather than chasing larger models.
  • Industry consensus at the conference highlighted building data moats, converting data from a cost into an asset, and closing the loop between data, models and business outcomes.

Editor's Desk

Strategic Analysis

The emphasis on data signals a strategic pivot across the AI ecosystem. As foundation models standardise and compute becomes more accessible, marginal gains from novel architectures will shrink while the value of curated, operational datasets will rise. This favours incumbents and specialised providers that control domain‑specific sensor fleets, annotation workflows and telemetry loops — assets that are hard to replicate quickly. For China, with its dense manufacturing base and active robotics sector, the prize is significant: firms that build interoperable data infrastructure could export not only models but the data‑driven processes that make those models reliable in production. Globally, expect a wave of investment into data‑ops platforms, tighter integration between hardware and software vendors, and regulatory scrutiny over how industrial data is collected and monetised. The AI race’s next frontier will be less about parameter counts and more about who can sustainably turn real‑world interactions into structured, verifiable, long‑lived assets.

China Daily Brief Editorial

At the AIGC Developers Conference 2026 in Hangzhou, Zhang Xiaobo, CEO of Hangzhou Cangjie Intelligent Technology, argued that the next phase of artificial intelligence will be won or lost on data. After two years of feverish attention on model size and multimodal capability, Zhang told an industry audience that developers and enterprises increasingly confront a blunt reality: models have advanced faster than the data systems needed to deploy them at scale in the real world.

Zhang drew a line between laboratory performance and industrial deployment, noting that in domains such as manufacturing, robotics, embodied intelligence and autonomous driving the limiting factor is not a lack of models but a lack of high‑quality, task‑structured data. For these applications models must be trained and continuously refined on data that faithfully reflects messy, dynamic physical environments, and that data must be engineered to be verifiable, reusable and producible on an ongoing basis.

Cangjie Intelligent’s answer is deliberate and infrastructural. Rather than racing for ever‑larger models, the company has prioritised building scalable data production systems: multimodal data engineering, complex‑scene dataset construction and task‑oriented annotation standards tailored to embodied and industrial use cases. The aim is to convert ephemeral labelled examples into reusable data assets that can feed training, inference and online improvement over time.

That emphasis reflects a broader industry reassessment: as models and compute become commoditised, the competitive moat returns to data engineering. Valuable datasets are not merely large; they are structured around concrete tasks and operational standards, they can be measured and iterated upon, and they integrate into a closed loop with application telemetry and model retraining. When data is treated as an engineered asset rather than a one‑off cost, model performance gains become durable and cumulative.

Panel discussions at the conference homed in on four recurring questions: how firms can build their own data moat after the arrival of big models; what new data regimes verticals and embodied systems demand; how to reframe data from a cost centre into a strategic asset; and how to create a positive flywheel linking data, models and business outcomes. These conversations underline a practical shift in industrial AI strategy away from experimenting with generic models and toward investing in domain‑specific data pipelines and standards.

For international observers, the message matters because it reframes where capital, talent and partnerships will flow over the next phase of AI deployment. Expect increased investment in data‑ops tooling, dataset marketplaces, sensor networks and annotation platforms; more M&A activity around proprietary industrial datasets; and growing attention to governance, privacy and cross‑border data rules that will shape how these assets can be shared. If Zhang is right, the second half of the AI race will look less like a contest of model architectures and more like a competition to build durable, industrial‑grade data infrastructure.
