Chinese open-source AI models (which repos deserve weekly checks)

Evergreen topic pages updated with new evidence

Last reviewed: 2026-06-04 · Policy: Editorial standards · Methodology

Decision in 20 seconds

If you want to track Chinese open-source AI models well, ground the routine in repo movement, model cards, and usable release paths. RadarAI is the first routing layer, but the actual decision comes from whether the repo, weights, docs, and license create a realistic path to testing.

Key points

  • DeepSeek has released multiple versions (e.g., DeepSeek-V4) emphasizing cost per token and domestic compute alignment.
  • Qwen is extending into video generation (HappyHorse 1.0 entered gray testing on Apr 28) and multimodal tooling, though no public repository update confirms the video work yet.
  • THUDM maintains foundational LLM and agent frameworks, though recent evidence of public releases is limited.

What changed recently

  • DeepSeek launched multimodal image understanding with sub-second latency (Apr 30), then introduced its 'Visual Primitive Thinking' framework (May 1); the associated technical paper was swiftly withdrawn.
  • Qwen’s HappyHorse 1.0 video model entered gray testing (Apr 28); no public release or GitHub update confirmed as of May 7.

Explanation

The evidence shows a consistent emphasis across Chinese open-source efforts on operational metrics—cost per token, latency, and deployment readiness—rather than raw benchmark gains.

However, documentation and release transparency vary: DeepSeek’s GitHub activity aligns closely with briefing claims, while THUDM’s recent updates are not directly corroborated in the evidence, and Qwen’s gray-testing status lacks public repository markers.

Tools / Examples

  • DeepSeek-V4 (GitHub: deepseek-ai/deepseek-vl) — focuses on efficiency and enterprise integration.
  • Qwen2-VL (GitHub: QwenLM/Qwen-VL) — supports vision-language tasks; updated Apr 2026 per repo history.

Evidence timeline

May 1 AI Briefing · Issue #252

A reinforcement learning reward shift triggered OpenAI's GPT-5.5 'Goblin Rebellion' incident, exposing a new risk to large-model behavioral controllability; meanwhile, DeepSeek achieved cost-effective outperformance over

AI Briefing, April 28 — Issue #244

OpenAI and Microsoft agree on multi-cloud decoupling to support IPO plans; Alibaba's HappyHorse 1.0 video generation model enters gray testing on Qwen; GitHub Copilot launches token-based AI credit billing.

May 7 AI Briefing · Issue #272

Generative AI is rapidly shifting from a 'model capability race' to a contest over infrastructure sovereignty and deep, scenario-specific deployment: cost per token has become the core metric in NVIDIA's redefined techni

May 4 AI Briefing · Issue #261

The release of DeepSeek-V4 marks AI's formal transition from consumer-facing traffic hype to a pragmatic phase focused on enterprise cost reduction, efficiency gains, and building a domestic computing ecosystem [14]; mea

May 3 AI Briefing · Issue #258

The AI industry is accelerating its shift from 'tool invocation' to 'embodied agents.' Codex's Computer Use capability and the open-source Clawd Cursor project mark a substantive breakthrough in AI's ability to operate g

AI Briefing, May 2 · Issue #257

DeepSeek rolls out multimodal image understanding in limited release; Apple confirms using Claude Code for its AI customer support system; RecursiveMAS introduces vector-level agent collaboration—outperforming top baseli

AI Briefing, May 2 · Issue #255

Multimodal reasoning and multi-agent collaboration are emerging as dual technical frontiers: DeepSeek open-sourced a vision-based reasoning framework to bridge spatial reference gaps; USTC and Huawei launched the 'Lingji

May 1 AI Briefing · Issue #254

DeepSeek unveiled its first visual reasoning capability, introducing the 'Visual Primitive Thinking' framework to bridge the multimodal referential gap—though its associated technical paper was swiftly withdrawn after re

AI Briefing, April 30 — Issue #251

GPT-5.5-Cyber launches for elite cybersecurity defenders; DeepSeek's image mode shows strong OCR and HTML reconstruction but flawed spatial reasoning; recursive multi-agent systems introduce latent-state direct transfer,

AI Briefing, April 30 — Issue #250

Multimodal capabilities and agent architecture design are emerging as new battlegrounds in AI infrastructure: DeepSeek launches full multimodal image understanding with sub-second latency; SenseNova-U1 achieves open-sour

FAQ

Why check these repos weekly?

Because infrastructure-relevant changes, such as latency improvements, tokenizer updates, or multimodal support, are often merged without formal announcements and directly affect deployment decisions.
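
One way to make that weekly pass concrete is to scan a repo's recent commit messages for the kinds of changes named above. The following is a hedged Python sketch, not a description of any tool used on this page: `flag_commits`, the `KEYWORDS` tuple, and the sample commits are all hypothetical, and the (timestamp, message) pairs are assumed to have been copied from the repo's commit history by whatever means you prefer.

```python
from datetime import datetime, timedelta, timezone

# Assumed keywords, taken from the examples in the answer above; tune to your stack.
KEYWORDS = ("latency", "tokenizer", "multimodal")


def flag_commits(commits, now, days=7):
    """Return messages of commits from the last `days` days that mention a keyword.

    `commits` is a list of (iso_timestamp, message) pairs. Timestamps must
    carry an explicit UTC offset such as "+00:00".
    """
    cutoff = now - timedelta(days=days)
    flagged = []
    for stamp, message in commits:
        when = datetime.fromisoformat(stamp)
        if when >= cutoff and any(k in message.lower() for k in KEYWORDS):
            flagged.append(message)
    return flagged


# Made-up sample data for illustration only.
sample = [
    ("2026-05-05T09:00:00+00:00", "Fix tokenizer edge case for long inputs"),
    ("2026-05-06T12:30:00+00:00", "Update README badges"),
    ("2026-04-20T08:00:00+00:00", "Reduce decode latency"),
]
now = datetime(2026, 5, 7, tzinfo=timezone.utc)
print(flag_commits(sample, now))  # ['Fix tokenizer edge case for long inputs']
```

Keyword matching is deliberately crude: it surfaces candidates for a human to read and does not replace inspecting the diff or the model card.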

Is THUDM’s recent work verified here?

No. The briefings contain no direct evidence of THUDM releases or GitHub updates. The repos remain active but go unmentioned in the May 2026 signals, so verification requires manual inspection.
