## Thesis
China's foundation models reached GPT-4o parity in 2026, and the open-source gap has closed. Qwen3-235B-A22B (Alibaba, April 2025, Apache 2.0 MoE) matches GPT-4o on MMLU (88.7) and exceeds it on MATH (79.4 vs 76.6). DeepSeek-V3 (MIT, December 2024, updated H1 2025) leads on code at below-GPT-4o cost. The key differentiator for global builders is no longer capability; it is the open license. Apache 2.0 and MIT let you fine-tune, self-deploy, and redistribute derivatives commercially without negotiating enterprise agreements. Models such as ERNIE 4.0 and Kimi k1.5 fill specific niches (enterprise Chinese NLP and long context, respectively) but are less universally accessible.
## Decision in 20 seconds
| Use case | Recommended model | Why |
|---|---|---|
| Commercial fine-tuning (open license) | Qwen3-8B / Qwen3-14B (Apache 2.0) | Broadest tooling support; explicit commercial fine-tuning rights; 0.6B–235B family consistency |
| RAG over multilingual documents | Qwen3-32B or Qwen3-30B-A3B MoE | Best multilingual retrieval-augmented performance in the open-weight class |
| Long-document processing (100K+ tokens) | Kimi k1.5 (API via platform.moonshot.cn) | 128K context native; competitive on long-context RULER benchmark |
| Code generation and STEM reasoning | DeepSeek-V3 (MIT) | HumanEval 84.2; MATH 81.6; MIT license; Together AI / Fireworks hosting available |
| Enterprise Chinese NLP (on-prem required) | ERNIE 4.0 (Baidu Cloud) or GLM-4 (open.bigmodel.cn) | Best Chinese-language domain adaptation; ERNIE has strongest Baidu ecosystem integration |
## China AI foundation models — full comparison (2026)
| Model | Release date | Parameters | License | Benchmark highlights | API access |
|---|---|---|---|---|---|
| Qwen3-235B-A22B (Alibaba) | April 2025 | 235B total / 22B active (MoE) | Apache 2.0 | MMLU 88.7, MATH 79.4, HumanEval 82.6 | dashscope.aliyun.com; AWS Bedrock; HuggingFace weights |
| DeepSeek-V3 (DeepSeek) | December 2024 (updated H1 2025) | 671B total / 37B active (MoE) | MIT | MMLU 87.1, MATH 81.6, HumanEval 84.2 | platform.deepseek.com; Together AI; Fireworks AI; Amazon Bedrock |
| Kimi k1.5 (Moonshot AI) | January 2025 | Undisclosed | Proprietary API | MMLU 85.4, MATH 77.3; RULER long-context 86.1 (128K) | platform.moonshot.cn (international payment accepted) |
| GLM-4 (Zhipu AI) | January 2024 (GLM-4-Plus: August 2024) | ~130B (estimated) | Apache 2.0 (GLM-4-9B); proprietary for full GLM-4 | MMLU 83.6, C-Eval 77.2 (strong Chinese benchmark) | open.bigmodel.cn; HuggingFace (GLM-4-9B weights) |
| ERNIE 4.0 Turbo (Baidu) | October 2024 | Undisclosed | Proprietary (Baidu Cloud only) | CMMLU 87.3 (Chinese-language SOTA); enterprise SLA available | Baidu Wenxin Workshop; Baidu Cloud enterprise (limited international access) |
| MiniMax-Text-01 (MiniMax) | January 2025 | 456B total / 45.9B active (MoE) | MIT (weights on HuggingFace) | MMLU 88.5, MATH 77.8; 1M context window | minimax.io (international tier); HuggingFace weights |
## API access methods — international builder guide
| Access method | Models available | Notes | Best for |
|---|---|---|---|
| Direct API (lab) | Qwen (dashscope.aliyun.com), DeepSeek (platform.deepseek.com), Kimi (platform.moonshot.cn), MiniMax (minimax.io) | OpenAI-compatible /v1/chat/completions on all four; Stripe/international card accepted on DeepSeek, Kimi, MiniMax | Lowest latency; most up-to-date model versions |
| HuggingFace (weights) | Qwen3 (all sizes), DeepSeek-V3/R1, GLM-4-9B, MiniMax-Text-01 | Free download; Apache 2.0 / MIT; no rate limits once downloaded; requires GPU infrastructure | Self-hosted fine-tuning, private deployment, offline inference |
| International cloud hosting | Qwen3 (AWS Bedrock, Cloudflare Workers AI), DeepSeek (Together AI, Fireworks AI, Amazon Bedrock), GLM-4 (Replicate) | US-billing, no China account required; Together AI and Fireworks add <50ms latency overhead; pricing competitive with direct API | Teams without China payment methods or needing US data residency |
| Self-deployment (vLLM / Ollama) | Qwen3-0.6B to Qwen3-32B; DeepSeek-R1-Distill-Qwen-7B; GLM-4-9B | Recent vLLM releases support Qwen3 natively; the Ollama library has quantized builds for M-series Macs; Qwen3-30B-A3B MoE needs roughly 60 GB at bf16 (an A100 80GB), so a 40 GB card requires 4-bit quantization | Air-gapped environments, on-prem enterprise, zero API cost at scale |
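All four direct APIs in the table above, and a self-hosted vLLM server, expose the same OpenAI-compatible `/v1/chat/completions` shape, so switching providers comes down to a base URL and a model name. A minimal Python sketch: the base URLs and model names below reflect provider docs at the time of writing and should be verified before use, and the local vLLM port is illustrative.

```python
# OpenAI-compatible endpoints from the table above. The self-hosted entry
# assumes a local `vllm serve` process; port and model name are illustrative.
PROVIDERS = {
    "deepseek":   {"base_url": "https://api.deepseek.com/v1",                    "model": "deepseek-chat"},
    "qwen":       {"base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1", "model": "qwen-plus"},
    "kimi":       {"base_url": "https://api.moonshot.cn/v1",                     "model": "moonshot-v1-128k"},
    "vllm_local": {"base_url": "http://localhost:8000/v1",                       "model": "Qwen/Qwen3-8B"},
}

def chat_payload(provider: str, prompt: str, temperature: float = 0.7) -> dict:
    """Build the JSON body for a POST to {base_url}/chat/completions.

    The same payload shape works unchanged across every provider above;
    only the model name differs.
    """
    return {
        "model": PROVIDERS[provider]["model"],
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }
```

With the official `openai` SDK the same pattern applies: `OpenAI(base_url=..., api_key=...).chat.completions.create(**chat_payload("deepseek", "hi"))`. That symmetry is what makes migrating between direct API, hosted inference, and self-deployment cheap.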
## FAQ
- What are the best China AI foundation models in 2026?
- Qwen3-235B-A22B (Apache 2.0, MMLU 88.7) for open-weight SOTA; DeepSeek-V3 (MIT, HumanEval 84.2) for code and reasoning; Kimi k1.5 for 128K long-context; GLM-4 for bilingual tasks; ERNIE 4.0 for enterprise Chinese NLP; MiniMax-Text-01 for 1M-context and multimodal.
- How does Qwen compare to DeepSeek for commercial use?
- Both are commercially usable under open licenses. Qwen3 (Apache 2.0) leads on multilingual and instruction following; DeepSeek-V3 (MIT) leads on code (HumanEval 84.2) and MATH (81.6). For fine-tuning pipelines: Qwen3 has broader tooling coverage. For code-heavy workloads: DeepSeek-V3.
- Which Chinese foundation models are available outside China?
- Qwen3 and DeepSeek-V3/R1 via HuggingFace globally. Kimi API (platform.moonshot.cn) and MiniMax API (minimax.io) accept international payment. ERNIE 4.0 is primarily China-gated via Baidu Cloud. GLM-4-9B open weights are available; full GLM-4 API is at open.bigmodel.cn with international tier.
- What is the best open-source Chinese LLM for fine-tuning?
- Qwen3-8B and Qwen3-14B — Apache 2.0 is explicit about commercial fine-tuning rights, tooling (LLaMA-Factory, Axolotl, Unsloth, vLLM) is comprehensive, and the 0.6B–235B family shares a consistent architecture and tokenizer. DeepSeek-R1-Distill-Qwen-7B is an alternative for reasoning-heavy tasks.
- How do Chinese foundation models compare to GPT-4o on benchmarks?
- Qwen3-235B-A22B ties GPT-4o on MMLU (88.7) and beats it on MATH (79.4 vs 76.6). DeepSeek-V3 leads GPT-4o on MATH (81.6 vs 76.6). GPT-4o still leads on code (HumanEval 90.2 vs DeepSeek 84.2). Among sub-30B open-weight models, Qwen3-30B-A3B MoE is the best GPT-4o approximation for private deployment.
- Where can I access Kimi, Qwen, and DeepSeek APIs in English?
- Kimi: platform.moonshot.cn (English docs, international card). Qwen: dashscope.aliyun.com or AWS Bedrock (no China account needed). DeepSeek: platform.deepseek.com or Together AI / Fireworks AI (US billing, OpenAI-compatible endpoints on all three).
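The fine-tuning answer above can be sketched as an Axolotl-style QLoRA config. This is a hedged sketch, not a tested recipe: the dataset path, hyperparameters, and output directory are placeholders, and key names should be checked against the Axolotl documentation for your installed version.

```yaml
# Minimal QLoRA fine-tune of a Qwen3 dense model with Axolotl.
# Placeholder values throughout; adjust to your data and hardware.
base_model: Qwen/Qwen3-8B
load_in_4bit: true            # QLoRA: fits a single 24 GB GPU
adapter: qlora
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_modules: [q_proj, k_proj, v_proj, o_proj]
datasets:
  - path: ./data/train.jsonl  # placeholder: your instruction-tuning data
    type: alpaca
sequence_len: 4096
micro_batch_size: 2
gradient_accumulation_steps: 8
num_epochs: 3
learning_rate: 2.0e-4
output_dir: ./outputs/qwen3-8b-lora
```

Because the Qwen3 family shares one architecture and tokenizer, the same config scales from the 0.6B to the 32B dense checkpoints by changing `base_model` and the batch settings.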
## Companion pages in this cluster
| If your question is about… | Go to | What's there |
|---|---|---|
| Which Chinese models are open-source and commercially usable | China AI Open Source Models | Full license comparison table — Apache 2.0, MIT, custom; download links and commercial use restrictions |
| Recent model releases and capability milestones | Model Release Tracker | Qwen3, DeepSeek, Kimi release timeline with benchmark data and source verification |
| Weekly digest of what changed in China AI | China AI Updates | Weekly signal digest — model releases, funding, policy, curated for builders |
| Which China AI companies to watch | Best China AI Companies | 15-company shortlist: foundation model labs, workflow infra, physical AI — with monitoring frequencies |
| China AI startup funding rounds and IPOs | China AI Startup Funding Tracker | Moonshot $1B Series B, MiniMax $600M, WeRide NYSE IPO — timeline with sources |
Quotable summary: In 2026, the correct question is no longer "can Chinese foundation models match GPT-4o?" — they can, on most benchmarks. The question is which open license fits your deployment, which API fits your payment infrastructure, and whether you need on-prem weights or managed inference. Qwen3 and DeepSeek-V3 answer both license and capability simultaneously; Kimi fills the long-context gap; ERNIE and GLM-4 serve enterprise Chinese NLP. Pick by use case, not by country of origin.