At the AIGC Developers Conference 2026 in Hangzhou, Zhang Xiaobo, CEO of Hangzhou Cangjie Intelligent Technology, argued that the next phase of artificial intelligence will be won or lost on data. After two years of feverish attention to model size and multimodal capability, Zhang told an industry audience that developers and enterprises increasingly confront a blunt reality: models have advanced faster than the data systems needed to deploy them at scale in the real world.
Zhang drew a line between laboratory performance and industrial deployment, noting that in domains such as manufacturing, robotics, embodied intelligence and autonomous driving the limiting factor is not a lack of models but a lack of high‑quality, task‑structured data. For these applications, models must be trained and continuously refined on data that faithfully reflects messy, dynamic physical environments and that is engineered to be verifiable, reusable and producible on an ongoing basis.
Cangjie Intelligent’s answer is deliberate and infrastructural. Rather than racing for ever‑larger models, the company has prioritised building scalable data production systems: multimodal data engineering, complex‑scene dataset construction and task‑oriented annotation standards tailored to embodied and industrial use cases. The aim is to convert ephemeral labelled examples into reusable data assets that can feed training, inference and online improvement over time.
That emphasis reflects a broader industry reassessment: as models and compute become commoditised, the competitive moat returns to data engineering. Valuable datasets are not merely large; they are structured around concrete tasks and operational standards, they can be measured and iterated upon, and they integrate into a closed loop with application telemetry and model retraining. When data is treated as an engineered asset rather than a one‑off cost, model performance gains become durable and cumulative.
Panel discussions at the conference homed in on four recurring questions: how firms can build their own data moat after the arrival of large models; what new data regimes verticals and embodied systems demand; how to reframe data from a cost centre into a strategic asset; and how to create a positive flywheel linking data, models and business outcomes. These conversations underline a practical shift in industrial AI strategy away from experimenting with generic models and toward investing in domain‑specific data pipelines and standards.
For international observers, the message matters because it reframes where capital, talent and partnerships will flow over the next phase of AI deployment. Expect increased investment in data‑ops tooling, dataset marketplaces, sensor networks and annotation platforms; more M&A activity around proprietary industrial datasets; and growing attention to governance, privacy and cross‑border data rules that will shape how these assets can be shared. If Zhang is right, the second half of the AI race will look less like a contest of model architectures and more like a competition to build durable, industrial‑grade data infrastructure.
