Top AI Updates of the Past Week: Large Models and Tools Roundup (First Week of February 2026)
Editorial standards and source policy: content links to primary sources; see Methodology.
Over the past seven days, the AI industry has once again entered a phase of rapid iteration: faster inference, open-sourced multimodal capabilities, programming agents integrated into mainstream IDEs, and evaluation standards reoriented around commercial practicality. This week's updates are not only dense but increasingly pragmatic. This article curates the 7 most impactful developments to help general readers quickly grasp key technology trends and application opportunities.
1. OpenAI GPT-5.2 Reduces Inference Latency by 40%
On February 4, OpenAI completed optimizations to the GPT-5.2 inference stack, reducing average API response latency by 40%. This improvement significantly enhances service stability and cost-efficiency under high-concurrency scenarios—especially beneficial for enterprise applications relying on real-time interaction. Internal testing shows server resource consumption drops by nearly one-third under identical workloads, strengthening support for large-scale deployment.
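The reported figures imply a sizable capacity gain per server. A back-of-the-envelope sketch (the 40% and one-third numbers come from the article; the baseline values are hypothetical placeholders):

```python
# Illustrative arithmetic only: the 40% latency cut and ~one-third
# resource reduction are the article's figures; the baselines are invented.

baseline_latency_ms = 1000.0          # hypothetical pre-optimization latency
latency_reduction = 0.40              # reported: 40% lower average latency
new_latency_ms = baseline_latency_ms * (1 - latency_reduction)  # 600 ms

baseline_cost_per_req = 1.0           # hypothetical unit of server resources
resource_reduction = 1 / 3            # reported: ~one-third less consumption
new_cost_per_req = baseline_cost_per_req * (1 - resource_reduction)

# Same hardware can therefore serve roughly 1.5x the requests.
capacity_gain = baseline_cost_per_req / new_cost_per_req
```

In other words, a one-third drop in per-request resource consumption translates into about 50% more throughput on unchanged hardware, which is where the "large-scale deployment" claim gets its force.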
2. MiniCPM-o 4.5: The World’s First Open-Source Full-Duplex Multimodal Model
MiniCPM-o 4.5, newly released by ModelBest’s OpenBMB team, is billed as the world’s first open-source full-duplex multimodal large language model. With just 9 billion parameters, it outperforms GPT-4o on tasks such as image understanding and voice interaction. It supports real-time audio-video input, proactive notifications, and contextual memory—making it ideal for edge devices and on-premises deployment. For individual developers, this means building affordable, “see-and-speak” AI assistants is now within reach.
3. Claude Code Natively Integrated into Xcode, Launching Agent-Based Programming
Anthropic and Apple have deeply integrated Claude Code into Xcode 26.3. Developers can now invoke Claude directly inside the IDE to perform cross-project understanding, visual verification, and autonomous task execution. For example, simply describing “Fix the responsive layout of this login page” triggers Claude to automatically locate relevant files, modify code, and generate previews. This marks a pivotal shift—from “assisted code completion” to true “agent-driven execution” in programming.
4. Qwen3-Coder-Next Achieves Efficient Coding via MoE Architecture
The Tongyi Qwen team at Alibaba has launched Qwen3-Coder-Next, a sparse Mixture-of-Experts (MoE) model with only 3 billion active parameters. On the HumanEval benchmark, its code-generation capability matches that of leading closed-source models—yet its inference cost is just 1/11 of theirs. Released alongside the vLLM framework, the model supports one-click deployment from day one, dramatically lowering the barrier for enterprise private deployment.
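The efficiency claim rests on sparse MoE routing: per token, a gate selects only a few experts to run, so active parameters stay far below the total count. A didactic sketch of top-k gating in plain Python (this illustrates the general technique, not the actual Qwen3-Coder-Next architecture):

```python
# Toy sparse Mixture-of-Experts layer: route each token to the top-k
# experts by gate score and mix only those experts' outputs.
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_layer(token, experts, gate_weights, k=2):
    """Run only the k highest-scoring experts and blend their outputs."""
    scores = softmax([sum(w * x for w, x in zip(row, token)) for row in gate_weights])
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:k]
    norm = sum(scores[i] for i in top)  # renormalize over selected experts
    return [
        sum(scores[i] / norm * experts[i](token)[d] for i in top)
        for d in range(len(token))
    ]

# Eight tiny "experts": each just scales the token by a different factor.
experts = [lambda t, s=s: [s * x for x in t] for s in range(1, 9)]
gate_weights = [[0.1 * i, -0.05 * i] for i in range(8)]  # hypothetical gate

out = moe_layer([1.0, 0.5], experts, gate_weights, k=2)
```

With k=2 of 8 experts active, only a quarter of the expert parameters participate in any forward pass; production MoE models scale the same idea to hundreds of billions of total parameters.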
5. ChatGPT Fully Supports the MCP Apps Standard
OpenAI announced that ChatGPT now fully supports the Model Context Protocol (MCP) Apps standard. This protocol enables different AI platforms to share contextual state, facilitating seamless cross-application collaboration. For example, a user could initiate an analytical task in Notion, prompting ChatGPT to invoke an MCP-connected database tool to perform computations—and then return the results. This move advances the AI application ecosystem toward standardized interoperability.
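MCP is layered on JSON-RPC 2.0, so the cross-application handoff described above comes down to exchanging structured messages. A minimal sketch of the kind of request a host would send to an MCP server to invoke a tool (the tool name and arguments here are hypothetical):

```python
# Build a JSON-RPC 2.0 request for an MCP `tools/call` invocation.
import json

def mcp_tool_call(request_id, tool_name, arguments):
    """Return an MCP-style tool-call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

msg = mcp_tool_call(1, "query_database", {"sql": "SELECT count(*) FROM notes"})
wire = json.dumps(msg)  # serialized form sent over the stdio or HTTP transport
```

Because every MCP-compatible platform speaks this same message shape, a tool registered once can be invoked from ChatGPT, an IDE, or any other MCP host without adapter code.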
6. Gemini Surpasses 750 Million Monthly Active Users; Token Throughput Reaches 10 Billion Per Minute
Google officially confirmed that the Gemini family of models has exceeded 750 million monthly active users, with its API processing 10 billion tokens per minute. Jeff Dean stated this represents the largest real-time AI service workload globally today. This high-throughput capability underpins deep integrations across products like Gmail and Google Workspace—and signals that multimodal AI services are transitioning from “demo phase” to “daily use.”
7. Industry Evaluation Criteria Shift Toward “Commercial Practicality”
The authoritative research firm Artificial Analysis released Intelligence Index v4.0, shifting its evaluation focus from purely technical metrics to real-world commercial practicality. The new framework emphasizes task completion rate, cost-effectiveness, and user retention—rather than benchmark scores alone. This evolution steers developers toward asking, “What problems can it solve?” instead of “How large are its parameters?”
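The article does not publish the v4.0 formula, but a weighted index over the three named signals conveys the idea. A hypothetical composite score in that spirit (weights, the cost ceiling, and all inputs are invented for illustration):

```python
# Hypothetical "practicality" index combining task completion rate,
# cost per task, and user retention into a single 0-100 score.

def practicality_index(completion_rate, cost_per_task_usd, retention_rate,
                       weights=(0.5, 0.2, 0.3), cost_ceiling_usd=1.0):
    """Blend normalized signals; higher completion/retention and lower cost win."""
    # Cheaper is better: map cost into [0, 1] against a chosen ceiling.
    cost_score = max(0.0, 1.0 - cost_per_task_usd / cost_ceiling_usd)
    w_complete, w_cost, w_retain = weights
    return 100 * (w_complete * completion_rate
                  + w_cost * cost_score
                  + w_retain * retention_rate)

score = practicality_index(completion_rate=0.85, cost_per_task_usd=0.10,
                           retention_rate=0.70)
```

Note how a model with mediocre benchmark scores but high completion and retention can outrank a benchmark leader under this kind of weighting, which is exactly the shift the new framework encourages.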
Tool Recommendation: How to Efficiently Track Weekly AI Updates?
With rapid, frequent updates, selecting reliable information sources is critical. The following tools help you save time and focus on what matters most:
| Purpose | Tool |
|---|---|
| Scan daily AI news, newly released models, and open-source projects | RadarAI, BestBlogs.dev |
| View model performance rankings and API usage statistics | OpenRouter, Hugging Face Leaderboard |
| Access developer-tested insights and hands-on tutorials | GitHub Trending, Juejin |
RadarAI aggregates high-quality AI updates from around the world and supports RSS subscriptions, so you never miss a key development.
Related reading
- Top China-Built AI Models to Watch in 2026: DeepSeek, Qwen, Kimi & More
- China AI Updates in English: What Builders Should Watch Each Month
- How to Track China AI in English Without Doomscrolling
- Best English Sources for China AI Industry Updates (2026 Guide)
RadarAI helps builders track AI updates, compare source-backed signals, and decide which changes are worth acting on.