Top AI Updates of the Past Week: Large Models and Tools Roundup (First Week of February 2026)
Editorial standards and source policy: content links to primary sources; see Methodology.
Over the past seven days, the AI industry has once again entered a phase of rapid iteration: faster inference, open-sourced multimodal capabilities, programming agents integrated into mainstream IDEs, and evaluation standards reoriented around commercial practicality. This week's updates are not only dense but increasingly pragmatic. This article curates the 7 most impactful developments to help general readers quickly grasp key technology trends and application opportunities.
1. OpenAI GPT-5.2 Reduces Inference Latency by 40%
On February 4, OpenAI completed optimizations to the GPT-5.2 inference stack, reducing average API response latency by 40%. This improvement significantly enhances service stability and cost-efficiency under high-concurrency scenarios—especially beneficial for enterprise applications relying on real-time interaction. Internal testing shows server resource consumption drops by nearly one-third under identical workloads, strengthening support for large-scale deployment.
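The reported figures imply a sizable capacity gain per server. A back-of-the-envelope sketch (the 40% and one-third numbers come from the article; the baseline values are hypothetical placeholders):

```python
# Illustrative arithmetic only: the 40% latency cut and ~one-third
# resource reduction are the article's figures; the baselines are invented.

baseline_latency_ms = 1000.0          # hypothetical pre-optimization latency
latency_reduction = 0.40              # reported: 40% lower average latency
new_latency_ms = baseline_latency_ms * (1 - latency_reduction)  # 600 ms

baseline_cost_per_req = 1.0           # hypothetical unit of server resources
resource_reduction = 1 / 3            # reported: ~one-third less consumption
new_cost_per_req = baseline_cost_per_req * (1 - resource_reduction)

# Same hardware can therefore serve roughly 1.5x the requests.
capacity_gain = baseline_cost_per_req / new_cost_per_req
```

In other words, a one-third drop in per-request resource consumption translates into about 50% more throughput on unchanged hardware, which is where the "large-scale deployment" claim gets its force.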
2. MiniCPM-o 4.5: The World’s First Open-Source Full-Duplex Multimodal Model
MiniCPM-o 4.5, newly released by ModelBest’s OpenBMB team, is billed as the world’s first open-source full-duplex multimodal large language model. With just 9 billion parameters, it outperforms GPT-4o on tasks such as image understanding and voice interaction. It supports real-time audio-video input, proactive notifications, and contextual memory—making it ideal for edge devices and on-premises deployment. For individual developers, this means building affordable, “see-and-speak” AI assistants is now within reach.
3. Claude Code Natively Integrated into Xcode, Launching Agent-Based Programming
Anthropic and Apple have deeply integrated Claude Code into Xcode 26.3. Developers can now invoke Claude directly inside the IDE to perform cross-project understanding, visual verification, and autonomous task execution. For example, simply describing “Fix the responsive layout of this login page” triggers Claude to automatically locate relevant files, modify code, and generate previews. This marks a pivotal shift—from “assisted code completion” to true “agent-driven execution” in programming.
4. Qwen3-Coder-Next Achieves Efficient Coding via MoE Architecture
The Tongyi Qwen team at Alibaba has launched Qwen3-Coder-Next, a sparse Mixture-of-Experts (MoE) model with only 3 billion active parameters. On the HumanEval benchmark, its code-generation capability matches that of leading closed-source models—yet its inference cost is just 1/11 of theirs. Released alongside the vLLM framework, the model supports one-click deployment from day one, dramatically lowering the barrier for enterprise private deployment.
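The efficiency claim rests on sparse MoE routing: per token, a gate selects only a few experts to run, so active parameters stay far below the total count. A didactic sketch of top-k gating in plain Python (this illustrates the general technique, not the actual Qwen3-Coder-Next architecture):

```python
# Toy sparse Mixture-of-Experts layer: route each token to the top-k
# experts by gate score and mix only those experts' outputs.
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_layer(token, experts, gate_weights, k=2):
    """Run only the k highest-scoring experts and blend their outputs."""
    scores = softmax([sum(w * x for w, x in zip(row, token)) for row in gate_weights])
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:k]
    norm = sum(scores[i] for i in top)  # renormalize over selected experts
    return [
        sum(scores[i] / norm * experts[i](token)[d] for i in top)
        for d in range(len(token))
    ]

# Eight tiny "experts": each just scales the token by a different factor.
experts = [lambda t, s=s: [s * x for x in t] for s in range(1, 9)]
gate_weights = [[0.1 * i, -0.05 * i] for i in range(8)]  # hypothetical gate

out = moe_layer([1.0, 0.5], experts, gate_weights, k=2)
```

With k=2 of 8 experts active, only a quarter of the expert parameters participate in any forward pass; production MoE models scale the same idea to hundreds of billions of total parameters.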
5. ChatGPT Fully Supports the MCP Apps Standard
OpenAI announced that ChatGPT now fully supports the Model Context Protocol (MCP) Apps standard. This protocol enables different AI platforms to share contextual state, facilitating seamless cross-application collaboration. For example, a user could initiate an analytical task in Notion, prompting ChatGPT to invoke an MCP-connected database tool to perform computations—and then return the results. This move advances the AI application ecosystem toward standardized interoperability.
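MCP is layered on JSON-RPC 2.0, so the cross-application handoff described above comes down to exchanging structured messages. A minimal sketch of the kind of request a host would send to an MCP server to invoke a tool (the tool name and arguments here are hypothetical):

```python
# Build a JSON-RPC 2.0 request for an MCP `tools/call` invocation.
import json

def mcp_tool_call(request_id, tool_name, arguments):
    """Return an MCP-style tool-call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

msg = mcp_tool_call(1, "query_database", {"sql": "SELECT count(*) FROM notes"})
wire = json.dumps(msg)  # serialized form sent over the stdio or HTTP transport
```

Because every MCP-compatible platform speaks this same message shape, a tool registered once can be invoked from ChatGPT, an IDE, or any other MCP host without adapter code.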
6. Gemini Surpasses 750 Million Monthly Active Users; Token Throughput Reaches 10 Billion Per Minute
Google officially confirmed that the Gemini family of models has exceeded 750 million monthly active users, with its API processing 10 billion tokens per minute. Jeff Dean stated this represents the largest real-time AI service workload globally today. This high-throughput capability underpins deep integrations across products like Gmail and Google Workspace—and signals that multimodal AI services are transitioning from “demo phase” to “daily use.”
7. Industry Evaluation Criteria Shift Toward “Commercial Practicality”
The authoritative research firm Artificial Analysis released Intelligence Index v4.0, shifting its evaluation focus from purely technical metrics to real-world commercial practicality. The new framework emphasizes task completion rate, cost-effectiveness, and user retention—rather than benchmark scores alone. This evolution steers developers toward asking, “What problems can it solve?” instead of “How large are its parameters?”
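The article does not publish the v4.0 formula, but a weighted index over the three named signals conveys the idea. A hypothetical composite score in that spirit (weights, the cost ceiling, and all inputs are invented for illustration):

```python
# Hypothetical "practicality" index combining task completion rate,
# cost per task, and user retention into a single 0-100 score.

def practicality_index(completion_rate, cost_per_task_usd, retention_rate,
                       weights=(0.5, 0.2, 0.3), cost_ceiling_usd=1.0):
    """Blend normalized signals; higher completion/retention and lower cost win."""
    # Cheaper is better: map cost into [0, 1] against a chosen ceiling.
    cost_score = max(0.0, 1.0 - cost_per_task_usd / cost_ceiling_usd)
    w_complete, w_cost, w_retain = weights
    return 100 * (w_complete * completion_rate
                  + w_cost * cost_score
                  + w_retain * retention_rate)

score = practicality_index(completion_rate=0.85, cost_per_task_usd=0.10,
                           retention_rate=0.70)
```

Note how a model with mediocre benchmark scores but high completion and retention can outrank a benchmark leader under this kind of weighting, which is exactly the shift the new framework encourages.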
Tool Recommendation: How to Efficiently Track Weekly AI Updates?
With rapid, frequent updates, selecting reliable information sources is critical. The following tools help you save time and focus on what matters most:
| Purpose | Tool |
|---|---|
| Scan daily AI news, newly released models, and open-source projects | RadarAI, BestBlogs.dev |
| View model performance rankings and API usage statistics | OpenRouter, Hugging Face Leaderboard |
| Access developer-tested insights and hands-on tutorials | GitHub Trending, Juejin |
RadarAI aggregates high-quality AI updates from around the world and supports RSS subscriptions, so you never miss a key development.
Related reading
- Top China-Built AI Models to Watch in 2026: DeepSeek, Qwen, Kimi & More
- China AI Updates in English: What Builders Should Watch Each Month
- How to Track China AI in English Without Doomscrolling
- Best English Sources for China AI Industry Updates (2026 Guide)
RadarAI helps builders track AI updates, compare source-backed signals, and decide which changes are worth acting on.