China AI API pricing and access changes: what builders should check weekly

2026-05-12 20:33

Author: fishbeta Editor: RadarAI Editorial Last updated: 2026-06-27

Editorial standards and source policy: Editorial standards, Team. Content links to primary sources; see Methodology.

China AI APIs change fast, and the changes that catch builders off guard are rarely the ones in the headlines. It's not the "DeepSeek just dropped a new model" announcement — you'll see that everywhere. It's the quiet quota reduction that takes your free tier from 500K tokens/day to 100K overnight. It's the billing system change that breaks your payment flow. It's the new geographic IP block that makes your staging environment suddenly return 403s. It's the pricing tier restructuring that doubles your cost while adding a new "enterprise" tier you didn't ask for. This page is a systematic guide to what builders should check weekly, per provider, to catch these changes before they catch you.

TL;DR: Each major Chinese AI API provider — DeepSeek, Qwen/Alibaba Cloud, Kimi/Moonshot, GLM/Zhipu, MiniMax, Baidu ERNIE — has its own pricing structure, free tier, regional access rules, and changelog cadence. A weekly 10-minute audit covering six provider-specific checkpoints can prevent most of the "surprise billing" and "API stopped working" incidents that plague builder workflows. This page gives you the checklist.

Who this is for

Builders actively using one or more Chinese AI APIs in production or staging environments
Teams monitoring Chinese AI APIs for cost optimization and budget planning
Product managers evaluating Chinese AI APIs for initial adoption decisions
DevOps engineers managing API integrations across multi-provider AI stacks

Who this is not for

Readers who want a general overview of Chinese AI models (go to the models list)
Teams not yet using Chinese AI APIs who want a general introduction (go to the China AI API access guide first)
Readers looking for performance benchmarks rather than access/pricing info

Decision in 20 seconds

If you just want current pricing, go straight to each provider's pricing page: DeepSeek at api.deepseek.com, Qwen at console.aliyun.com (Model Studio section), Kimi at platform.moonshot.cn, GLM at open.bigmodel.cn, MiniMax at minimax.io/platform, ERNIE at qianfan.cloud.baidu.com. This guide explains what to look for on those pages and why the numbers change.

Why API changes catch builders off guard

Three structural factors make Chinese AI API changes particularly disruptive for English-speaking builders:

1. Changelog localization lag: Most Chinese AI API changelogs are published in Chinese first. The English documentation update follows hours to days later. For pricing changes specifically, the English documentation can lag 48–72 hours behind the actual implementation. If you're checking only English docs, you may miss a pricing change that's already live.

2. Free tier volatility: Chinese AI providers have used aggressive free tier promotions as adoption growth tools — then adjusted them as usage scaled. DeepSeek reduced free API quota multiple times in 2025. Qwen's Alibaba Cloud free tier structure changed twice in Q1 2026. Treating a free tier as stable is a liability.

3. Regional access complexity: The geographic availability of Chinese AI APIs is more nuanced than simple China/non-China splits. Some providers have regional differences in quota allocation, feature availability, and even model versioning between their domestic and international API surfaces. A test that passes from a US IP may behave differently from a Southeast Asian IP.
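One way to act on the localization lag in point 1 is to diff the Chinese-language changelog directly instead of waiting for the English one. The sketch below assumes you already save a text snapshot of each changelog page every week; the function names and the watch-term list (including the Chinese terms for price, free, quota, and taken offline) are illustrative choices, not anything a provider publishes.

```python
import hashlib

# Illustrative watch terms; extend them for your own providers.
WATCH_TERMS = (
    "pricing", "price", "free tier", "quota", "deprecat", "default model",
    "价格", "免费", "额度", "下线",
)

def fingerprint(text: str) -> str:
    """Stable hash of a changelog snapshot, cheap to store week to week."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def lines_to_review(old_text: str, new_text: str) -> list[str]:
    """Return new changelog lines that mention a watched term."""
    if fingerprint(old_text) == fingerprint(new_text):
        return []  # nothing changed since the last snapshot
    old_lines = set(old_text.splitlines())
    hits = []
    for line in new_text.splitlines():
        if line in old_lines:
            continue  # already seen last week
        if any(term in line.lower() for term in WATCH_TERMS):
            hits.append(line.strip())
    return hits
```

Fetching the pages is left to the caller, since each provider structures its changelog differently.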

Per-provider weekly check guide

DeepSeek

Pricing page: api.deepseek.com/pricing (Chinese and English, usually in sync within 24h)

Current pricing (as of Q1 2026): DeepSeek-V3 at $0.14/M input tokens, $0.28/M output tokens (cache hit reduces to ~$0.014/M input). DeepSeek-R1 at $0.55/M input, $2.19/M output. These are among the lowest prices in the global AI API market for comparable capability.

Free tier: DeepSeek provides a free tier with daily rate limits. The specific limits have changed multiple times, so check the dashboard for your live quota. Historically the free tier has ranged from 50K to 500K tokens/day depending on model and time period.

Regional restrictions: DeepSeek API access has had documented disruptions from certain IP ranges. Access from mainland Chinese IPs to the international API endpoint has been inconsistent. Most global builders use the international API at api.deepseek.com directly. Users in some regions have reported needing a VPN for reliable access, though official policy doesn't explicitly restrict by geography.

Notable 2026 changes: In January 2026, DeepSeek implemented cache pricing — reducing cost dramatically for repeated similar prefixes. This is significant for RAG workloads where system prompts are reused.
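To see why cache pricing matters for RAG workloads, here is a rough worked calculation using the DeepSeek-V3 prices quoted above ($0.14/M input on a cache miss, about $0.014/M on a cache hit). The function name and the 80% hit rate are assumptions for illustration; your real hit rate depends on how much of each prompt is a reused prefix.

```python
def blended_input_cost_per_m(miss_price: float, hit_price: float, hit_rate: float) -> float:
    """Effective USD cost per million input tokens at a given cache hit rate."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_price + (1.0 - hit_rate) * miss_price

# Prices as quoted in this guide; the 0.8 hit rate is an assumed example value.
cost = blended_input_cost_per_m(miss_price=0.14, hit_price=0.014, hit_rate=0.8)
print(f"${cost:.4f} per million input tokens")
```

At an 80% hit rate the blended input price comes to about $0.039/M, roughly 72% below the cache-miss price.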

Weekly check: Log in to the API dashboard, check quota usage against last week, check the pricing page for any tier changes, and verify API latency from your primary deployment region.
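The "quota usage against last week" comparison can be reduced to two simple checks. This is a minimal sketch: the UsageSnapshot fields, the 1.5x spike ratio, and the 80% quota warning are all hypothetical values to tune, and the numbers have to be copied or exported from the provider dashboard yourself.

```python
from dataclasses import dataclass

@dataclass
class UsageSnapshot:
    provider: str
    tokens_used: int         # usage in the current period
    tokens_prev_period: int  # usage in the previous comparable period
    quota: int               # 0 means no hard quota (pay-as-you-go)

def usage_flags(s: UsageSnapshot, spike_ratio: float = 1.5, quota_warn: float = 0.8) -> list[str]:
    """Return human-readable flags for a single provider snapshot."""
    flags = []
    if s.tokens_prev_period > 0 and s.tokens_used > spike_ratio * s.tokens_prev_period:
        flags.append("usage spike")
    if s.quota > 0 and s.tokens_used >= quota_warn * s.quota:
        flags.append("near quota")
    return flags

snapshot = UsageSnapshot("deepseek", tokens_used=900_000, tokens_prev_period=400_000, quota=1_000_000)
print(snapshot.provider, usage_flags(snapshot))
```

The same function works for any provider in this guide, since it only needs three numbers per provider.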

Qwen / Alibaba Cloud Model Studio

Pricing page: console.aliyun.com → Model Studio → Pricing. (modelscope.cn/models is Alibaba's model hub for browsing Qwen models, not a pricing page.)

Current pricing (Q1 2026): Qwen-Max at ~¥0.12/K tokens (~$0.017/K tokens); Qwen-Plus at ~¥0.008/K tokens (~$0.001/K tokens); Qwen-Long (1M context) at ~¥0.0005/K tokens for cached input. Prices in CNY, with international billing converting at market rate.

Free tier: Alibaba Cloud provides a startup/free tier with 1M tokens/month for Qwen-Turbo as of Q1 2026. Free tier access requires creating an Alibaba Cloud account (international accounts supported but require phone verification).

Regional restrictions: Qwen API is officially available globally. However, the Chinese domestic endpoint (dashscope.aliyuncs.com) and the international endpoint (dashscope-intl.aliyuncs.com) have separate authentication and different model availability windows: new models sometimes appear on the domestic endpoint weeks before the international one. This is the most significant regional access issue for Qwen.

Notable 2026 changes: Qwen3 (April 2026) introduced thinking mode across the model family. The enable_thinking API parameter for toggling thinking mode shipped in the same release, but the English documentation lagged by about 72 hours.

Weekly check: Check the Model Studio changelog in Alibaba Cloud console, verify which models are available on your account tier (domestic vs. international endpoint), check CNY/USD rate impact on your effective token cost.
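Because Qwen prices are quoted in CNY per thousand tokens while most US providers quote USD per million, it helps to normalize before comparing. The following is a small sketch; the function name is made up for this example, and the exchange rates in the loop are assumed sample values, so substitute the live rate.

```python
def cny_per_k_to_usd_per_m(cny_per_k: float, cny_per_usd: float) -> float:
    """Convert a price in CNY per 1K tokens to USD per 1M tokens."""
    if cny_per_usd <= 0:
        raise ValueError("cny_per_usd must be positive")
    return cny_per_k * 1000 / cny_per_usd

# 0.008 CNY/K is the Qwen-Plus price quoted above; the rates are assumed examples.
for rate in (7.0, 7.2, 7.4):
    usd = cny_per_k_to_usd_per_m(0.008, rate)
    print(f"at {rate} CNY/USD: ${usd:.3f} per million tokens")
```

Running it across a few rates shows how far currency movement alone shifts your effective cost, even when the CNY list price stays put.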

Kimi / Moonshot AI

Pricing page: platform.moonshot.cn/pricing

Current pricing (Q1 2026): Moonshot-v1-8k at ¥0.012/K tokens; Moonshot-v1-32k at ¥0.024/K tokens; Moonshot-v1-128k at ¥0.06/K tokens. Long-context pricing is competitive — 128K context at this price point is notable.

Free tier: Moonshot provides initial credits on signup at platform.moonshot.cn (historically CNY 15–50). There is no ongoing free tier for production use; it's credit-based with mandatory top-up.

Regional restrictions: Moonshot/Kimi API has the most significant access friction for international builders among the major Chinese providers. The platform requires a Chinese phone number for full account creation as of early 2026. International users have reported success using virtual phone numbers, but this is not officially supported. The API itself (once credentials are established) is accessible globally without geographic restrictions.

Notable 2026 changes: Kimi k1.5 (the long-context reasoning model announced in early 2025) API access moved from waitlist to general availability in Q4 2025. Context window support was extended.

Weekly check: Check the platform announcement page (platform.moonshot.cn/announcements), verify API credit balance, check if any new context window options have been added.

GLM / Zhipu AI BigModel

Pricing page: open.bigmodel.cn/pricing

Current pricing (Q1 2026): GLM-4-Plus at ¥0.05/K tokens; GLM-4-Flash (fastest, cost-optimized) at ¥0.001/K tokens; GLM-4-Long (1M context) at ¥0.001/K tokens for input. GLM-4-Flash is among the cheapest Chinese AI APIs per token.

Free tier: Zhipu AI provides a meaningful free tier — GLM-4-Flash has a free quota that resets monthly. As of early 2026, the free tier covers approximately 1M tokens/month on GLM-4-Flash.

Regional restrictions: GLM API (open.bigmodel.cn) is accessible globally. No documented geographic restrictions. International billing via PayPal and some credit cards is supported. This is one of the more accessible Chinese AI APIs for international builders.

Notable 2026 changes: CogVideoX API access was added to BigModel in late 2025, allowing programmatic video generation via the same credential system as text models. Agent tool-calling API was updated to support more complex function schemas.

Weekly check: Check open.bigmodel.cn changelog, verify free tier quota reset, check if CogVideoX pricing has changed (it has been updated multiple times as the model improved).

MiniMax

Pricing page: minimax.io/platform or developer console at api.minimax.chat

Current pricing (Q1 2026): MiniMax-Text-01 (1M context) at $0.015/K input tokens, $0.075/K output tokens. Audio generation pricing is separate — check the platform for current rates. Video generation (Hailuo) is credit-based with a different pricing structure.

Free tier: MiniMax provides starter credits on international account creation. Free tier volume is limited — production workloads require a paid plan.

Regional restrictions: MiniMax operates an international API specifically for non-Chinese users at minimax.io. The domestic Chinese API (api.minimax.chat) and the international API use separate endpoints and have different model availability. International users should use minimax.io exclusively.

Notable 2026 changes: MiniMax-Text-01 replaced the previous MiniMax-01 as the flagship API model. The context window was confirmed at 1M tokens with both text and audio input support.

Weekly check: Check minimax.io/news for product updates, verify international API endpoint stability, check pricing for audio/video generation separately from text (they update at different cadences).

Baidu ERNIE / Qianfan

Pricing page: qianfan.cloud.baidu.com/pricing

Current pricing (Q1 2026): ERNIE 4.0 Turbo at ¥0.12/K tokens; ERNIE Speed at ¥0.004/K tokens (heavily discounted); ERNIE Lite at ¥0.003/K tokens. Significant free tiers for some models.

Free tier: Baidu Qianfan provides substantial free tiers on some models — ERNIE Lite and ERNIE Speed have had unlimited free quotas with rate limits at various points in 2025-2026. This changes frequently.

Regional restrictions: Qianfan access for international users requires Chinese business entity verification for enterprise tiers. Personal developer accounts with international credentials are possible but have higher friction than most other providers. API access itself (once credentialed) works globally.

Notable 2026 changes: Baidu restructured Qianfan pricing in Q4 2025, introducing a new "ERNIE Speed" tier with dramatically reduced pricing. Free tier quotas for promotional models have fluctuated multiple times.

Weekly check: Check qianfan.cloud.baidu.com/doc/release-notes for model and pricing updates, verify free tier quota availability, check if any new model tiers have been added.

Provider comparison table

Provider	Pricing Page	Free Tier	Regional Restrictions	Notable 2026 Changes
DeepSeek	api.deepseek.com	Daily rate limit (variable)	Some IP disruptions reported	Cache pricing introduced Jan 2026
Qwen/Alibaba	console.aliyun.com Model Studio	1M tokens/month (Qwen-Turbo)	Domestic vs. international endpoint split; new models appear on domestic first	Qwen3 thinking mode added April 2026
Kimi/Moonshot	platform.moonshot.cn	Signup credits only	Chinese phone required for full account; API globally accessible	k1.5 API to GA in Q4 2025
GLM/Zhipu	open.bigmodel.cn	~1M tokens/month (GLM-4-Flash)	Global access, no geo restrictions	CogVideoX API added, agent tool calling updated
MiniMax	minimax.io	Starter credits	International vs. domestic endpoint split	MiniMax-Text-01 1M context confirmed
Baidu ERNIE	qianfan.cloud.baidu.com	ERNIE Speed/Lite free tier (fluctuates)	Enterprise tier needs CN business entity	ERNIE Speed tier added Q4 2025

What "access" means for global builders

Access is not binary — it's a spectrum with four distinct levels:

Level 1 — Fully accessible: No geographic restrictions, international credit card billing, English documentation maintained in sync with Chinese. GLM/Zhipu and DeepSeek come closest to this level currently.

Level 2 — Accessible with friction: International accounts work but require workarounds (virtual phone numbers, account age minimums, or registration flow quirks). Qwen/Alibaba Cloud and MiniMax fall here for most international builders.

Level 3 — Restricted access: Account creation requires Chinese credentials for full access, but the API itself works globally once authenticated. Kimi/Moonshot is the main example — phone number verification is the primary barrier.

Level 4 — Restricted deployment: Enterprise tiers require Chinese business entity verification. ERNIE/Baidu falls here for full enterprise access. API developer access is possible but limited.

For builders choosing between providers, the level of access friction should be factored into total cost of adoption — not just per-token pricing.

The weekly 10-minute API audit routine

This routine takes 10 minutes and prevents most billing surprises and access incidents:

Minutes 1–3: Check your API dashboard for each active provider. Look at: daily usage vs. quota, any warnings or alerts, and billing/credit balance. Flag any unexpected usage spikes.

Minutes 4–6: Scan the changelog or release notes for each active provider (most have a dedicated page — check the URLs listed above). Specifically look for: pricing tier changes, free tier modifications, new model versions being set as default, deprecated model versions.

Minutes 7–8: Run a quick smoke test API call from your staging environment. Verify that your primary model, primary API region, and authentication method are all working. This catches regional access issues before they become incidents.

Minutes 9–10: Check the X accounts of DeepSeek, Qwen, and any other relevant labs (@deepseek_ai, @qwen_lm, etc.) for major announcements. Lab X accounts often post about API changes 24–48 hours before the official documentation update.
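The smoke test in the Minutes 7–8 step can be sketched as a small wrapper that times one request and interprets the HTTP status. Here send_request is a hypothetical callable you supply: it should make one minimal API call with your real client and return the status code. The verdict strings and the 10-second latency threshold are assumptions, not provider guidance.

```python
import time
from typing import Callable

def classify_status(status: int) -> str:
    """Map an HTTP status code to a likely cause worth investigating."""
    if 200 <= status < 300:
        return "ok"
    if status == 401:
        return "auth failure: check API key"
    if status == 403:
        return "forbidden: possible regional or IP block"
    if status == 429:
        return "rate limited: quota may have changed"
    return f"unexpected status {status}"

def smoke_test(send_request: Callable[[], int], max_latency_s: float = 10.0) -> dict:
    """Run one request, time it, and summarize the outcome."""
    start = time.monotonic()
    status = send_request()
    elapsed = time.monotonic() - start
    return {
        "status": status,
        "verdict": classify_status(status),
        "latency_s": elapsed,
        "slow": elapsed > max_latency_s,
    }

# Stand-in for a real call; replace the lambda with your own client code.
print(smoke_test(lambda: 200)["verdict"])
```

Keeping the HTTP call behind a callable means the same check can run against every provider and every deployment region without changes.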

Decision tree: when to migrate vs. stay on a China AI API

Run through this decision tree quarterly:

Stay if: Current pricing is within budget, access is reliable from your deployment regions, the model family continues to perform on your core use cases, and the license remains compatible with your deployment.

Evaluate alternatives if: Free tier has been reduced twice in 6 months (a signal of monetization pressure), access from one or more of your deployment regions has had more than 2 disruptions in a quarter, or a significantly cheaper or more capable alternative has emerged that you haven't benchmarked.

Migrate if: Pricing increased more than 30% without capability improvement, geographic access to your primary deployment region is unreliable, license terms changed in a way that conflicts with your commercial deployment, or the model family is being deprecated with no maintained successor.
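The three branches above can be encoded as a small function so the quarterly review produces a consistent answer. This is one possible encoding: the field names are hypothetical, and the thresholds are taken directly from the text (30% price increase, two free tier cuts, more than two disruptions). Migrate conditions are checked first because they override the softer signals.

```python
from dataclasses import dataclass

@dataclass
class ProviderReview:
    price_increase_pct: float            # price change since last review, in percent
    capability_improved: bool
    free_tier_cuts_6mo: int              # free tier reductions in the last 6 months
    region_disruptions_quarter: int      # worst count across deployment regions
    primary_region_unreliable: bool
    license_conflict: bool
    deprecated_no_successor: bool
    unbenchmarked_better_alternative: bool

def quarterly_decision(r: ProviderReview) -> str:
    """Return 'migrate', 'evaluate alternatives', or 'stay'."""
    if (
        (r.price_increase_pct > 30 and not r.capability_improved)
        or r.primary_region_unreliable
        or r.license_conflict
        or r.deprecated_no_successor
    ):
        return "migrate"
    if (
        r.free_tier_cuts_6mo >= 2
        or r.region_disruptions_quarter > 2
        or r.unbenchmarked_better_alternative
    ):
        return "evaluate alternatives"
    return "stay"
```

A review with no triggered conditions returns "stay", matching the first branch of the tree.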

FAQ

How much does DeepSeek API cost? As of Q1 2026: DeepSeek-V3 at $0.14/M input tokens and $0.28/M output tokens (full price), with cache hit pricing reducing input cost to approximately $0.014/M tokens. DeepSeek-R1 is $0.55/M input and $2.19/M output. These prices are significantly lower than comparable US model APIs — GPT-4o at $2.50/M input is approximately 18x more expensive than DeepSeek-V3 at similar capability levels on most benchmarks.

Is Qwen API available globally? Yes, with caveats. Alibaba Cloud Model Studio has an international endpoint (dashscope-intl.aliyuncs.com) accessible globally. New Qwen models typically appear on the domestic endpoint (dashscope.aliyuncs.com) first, sometimes with a 2–4 week lag before international availability. Account creation requires Alibaba Cloud registration, which works with international email but requires phone verification.

How to access Kimi API outside China? The Kimi/Moonshot platform (platform.moonshot.cn) requires Chinese phone number verification for full account creation. For international builders, common approaches are: (1) use a virtual phone number service for the signup step (not officially supported but widely used), (2) access via third-party API aggregators that have Kimi access, or (3) work with a Chinese partner who can set up the account. Once credentials are established, the API itself is accessible globally.

What are GLM API pricing tiers? Zhipu AI BigModel (open.bigmodel.cn) has multiple tiers as of Q1 2026: GLM-4-Flash (fastest, cost-optimized) at ¥0.001/K tokens with free monthly quota; GLM-4-Plus (highest capability) at ¥0.05/K tokens; GLM-4-Long (1M context) at ¥0.001/K tokens input. GLM-4-Flash's pricing makes it one of the cheapest Chinese AI API options for high-volume text generation.

Do Chinese AI APIs work from AWS or GCP? Yes, for most providers. GLM and MiniMax APIs work from cloud egress IPs in US/EU/APAC regions without documented restrictions. Qwen/Alibaba Cloud works from non-Alibaba Cloud IPs. DeepSeek generally works too, but it has had periodic disruptions from certain IP ranges, so test from your specific cloud region rather than assuming general availability.

How do Chinese AI API prices compare to OpenAI? As of Q1 2026, Chinese AI APIs are dramatically cheaper per token at the high-capability end: DeepSeek-V3 ($0.14/M input) vs GPT-4o ($2.50/M input) is roughly an 18x difference. The picture reverses at the budget end: Qwen-Plus (~$0.001/K tokens = $1/M) is roughly 6–7x more expensive than GPT-4o-mini ($0.15/M input). In short, the gap is largest for the most capable tiers and shrinks or disappears for the cheapest ones (GPT-4o-mini is competitive with Qwen-Turbo for English tasks).

What happens if a Chinese AI API goes down? Chinese AI APIs don't have AWS-equivalent SLA guarantees at the free/developer tier. Documented incidents: DeepSeek API had 6+ hours of degraded service in January 2025 after viral international adoption. Moonshot/Kimi had rate limit enforcement changes that caused sudden 429 errors without advance notice. Best practices: implement exponential backoff and retry logic, maintain a fallback provider (e.g., if primary is DeepSeek, have Qwen configured as backup), and monitor API latency as a leading indicator.
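The backoff-plus-fallback practice described above can be sketched as follows. This is a minimal illustration, not a production client: RetryableError and call_with_fallback are hypothetical names, each provider is represented by a callable you write around your own SDK or HTTP code, and that callable is assumed to raise RetryableError on transient failures such as 429 or 5xx responses.

```python
import random
import time
from typing import Callable, Sequence

class RetryableError(Exception):
    """Raised by a provider callable for transient failures (e.g. 429, 5xx)."""

def call_with_fallback(
    providers: Sequence[tuple[str, Callable[[], str]]],
    max_attempts: int = 3,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[str, str]:
    """Try each provider in order, retrying with exponential backoff and jitter.

    Returns (provider_name, response). Raises RuntimeError if all fail.
    """
    last_error = None
    for name, call in providers:
        for attempt in range(max_attempts):
            try:
                return name, call()
            except RetryableError as exc:
                last_error = exc
                if attempt < max_attempts - 1:
                    delay = base_delay_s * (2 ** attempt)  # 1s, 2s, 4s, ...
                    sleep(delay + random.uniform(0, delay / 2))  # add jitter
    raise RuntimeError("all providers failed") from last_error
```

Listing DeepSeek first and Qwen second in providers reproduces the primary/backup arrangement from the example in the text. The sleep parameter is injectable so the logic can be exercised in tests without real delays.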

Evidence and source links

DeepSeek API pricing: api.deepseek.com — pricing page with full token costs
Alibaba Cloud Model Studio: console.aliyun.com — Qwen pricing and tier documentation
Moonshot AI platform: platform.moonshot.cn — Kimi API pricing
Zhipu BigModel: open.bigmodel.cn — GLM pricing and free tier documentation
MiniMax developer: minimax.io — international API access
Baidu Qianfan: qianfan.cloud.baidu.com — ERNIE pricing documentation

Companion pages in this cluster

China AI API access guide: /en/china-ai-api-access-guide — detailed guide for accessing each Chinese AI API from outside China
China foundation model companies: /en/topics/china-foundation-model-companies — full company profiles
China AI models list: /en/china-ai-models-list — current model watchlist
China AI in 2026 overview: /en/china-ai-in-2026 — complete landscape reference