
Chinese AI in 2025, Wrapped

A year for the history books on the Chinese AI beat. We began the year astonished by DeepSeek’s frontier model, and are ending in December with Chinese open models like Qwen powering Silicon Valley’s startup gold rush.

It’s a good time to stop and reflect on Chinese AI milestones throughout 2025. What really mattered, and what turned out to be nothingburgers?

This piece recaps:

  • The biggest model drops of the year

  • China’s evolving AGI discussion among Alibaba leadership and the Politburo

  • The biggest swings in the US-China chip war

  • Beijing’s answer to America’s AI Action Plan and the MFA’s

  • Robots

Models

The DeepSeek Moment

Liang Wenfeng lit the fire

DeepSeek-R1 came out on January 20, thwarting everyone’s Chinese New Year plans. The cost-efficient LLM, which uses a Mixture-of-Experts (MoE) architecture, caused many in Silicon Valley to re-evaluate their bets on scaling — and on unfettered American dominance in frontier models. DeepSeek is powered by domestically trained Chinese engineering talent, an apparent belief in AGI, and no-strings-attached hedge fund money (it is owned by High-Flyer 幻方量化, a Hangzhou-based quantitative trading firm). There were initial concerns that such a recipe could not be replicated by more capital-constrained Chinese tech startups, but Kimi proved that wrong with K2 in July; Z.ai, Qwen, and MiniMax followed.
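
For readers who want a mechanical sense of what “Mixture-of-Experts” means, here is a minimal, illustrative sketch of an MoE feed-forward layer in PyTorch: a router picks a few “experts” per token, so only a fraction of the layer’s parameters do work on any given token, which is where the cost efficiency comes from. The class name and hyperparameters below are invented for illustration; this is not DeepSeek’s actual architecture or code.

```python
# Toy sketch of a Mixture-of-Experts (MoE) feed-forward layer.
# Illustrative only: names and sizes are made up, not taken from DeepSeek.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    def __init__(self, d_model=64, d_hidden=256, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # The router scores each token against every expert.
        self.router = nn.Linear(d_model, n_experts)
        # Each expert is an ordinary feed-forward block.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):  # x: (n_tokens, d_model)
        scores = F.softmax(self.router(x), dim=-1)         # routing probabilities
        weights, chosen = scores.topk(self.top_k, dim=-1)  # keep only top-k experts per token
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, slot] == e                 # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(10, 64)
print(ToyMoELayer()(tokens).shape)  # torch.Size([10, 64]); only 2 of 8 experts run per token
```

The point of the design is that total parameter count can grow with the number of experts while per-token compute stays roughly constant, since each token only activates its top-k experts — attractive when compute, not parameters, is the binding constraint.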

We translated Chinese tech media 36Kr’s interview with DeepSeek CEO Liang Wenfeng back in November 2024, and spent much of January 2025 on the DeepSeek beat (see Jordan’s conversations on DeepSeek with Miles Brundage here and with Kevin Xu of Interconnected here). Over at the newsletter, we covered how China reacted to DeepSeek’s rise, its secret sauce, and concerns around open-source as a strategy.

DeepSeek continues to be a big deal. For one, it paved the way for an open-source race dominated by Chinese models. Nearly every notable model released by Chinese companies in 2025 has been open source. In public blog posts, social media discussions, and private conversations, Chinese engineers and tech executives repeatedly attribute their open-source orientation to the example set by DeepSeek.

On the technical end, despite some remaining mystery surrounding the exact cost of training R1, DeepSeek’s viability was a shot in the arm for Chinese labs working under compute constraints. Going into 2026, with restrictions on H200s loosened and reporting that DeepSeek is still training on smuggled Nvidia hardware, easier access to TSMC-fabbed Nvidia chips may be just what DeepSeek needs ...
