Weekly AI Highlights · May 29, 2026

2026-05-29 09:00

Author: RadarAI Editorial · Editor: RadarAI Editorial · Last updated: 2026-07-14 · Review status: Editorial review pending · Tags: Weekly report, Official, AI hot topics

The new Claude Code `/usage` command launches—marking the first production-grade, token-level granular tracking of consumption across four agent capability types: Skills, Agents, MCPs, and Plugins—ushering AI engineering into the era of 'measurable cost'.

Editorial standards and source policy: Editorial standards, Team. Content links to primary sources; see Methodology.


## Weekly Overview

- The new Claude Code `/usage` command launches, marking the first production-grade, token-level tracking of consumption across four agent capability types (Skills, Agents, MCPs, and Plugins) and ushering AI engineering into the era of 'measurable cost'.
- DeepSeek V4-Pro API permanently drops to 25% of its original price, and Xiaomi's MiMo-V2.5 cuts inference cost to ¥0.025 per million tokens. Together they launch a 'token price war' among China's large models, in which inference efficiency has become a make-or-break factor for products.
- Baibu Intelligence releases **ForgeTrain**, the world's first production-grade pretraining framework written entirely by AI, and uses it to train the MiniCPM5-1B model, signaling that 'AI building AI' has matured from proof of concept to a closed engineering loop.
- Huawei proposes a new chip evolution paradigm, the 'Tao Law', which replaces transistor density with the time constant τ as the core metric and redefines compute advancement through logic folding and 3D stacking, challenging the foundational logic of Moore's Law.
- Anthropic open-sources the **Cybersecurity Skills** project (754 structured security skills), delivering the first executable, verifiable, and composable real-world cybersecurity knowledge base for AI agents and shifting agent capability from 'can do' to 'should do'.
- OPPO ColorOS 16 and vivo's official AI shopping assistant jointly validate the path toward 'invisible intelligence': lightweight intent-recognition models, agent workflows, and RAG knowledge bases. Both confirm that mobile AI, in its ultimate form, moves beyond explicit interaction.

## Hot Topics List

Claude Code adds /usage command for detailed token-consumption reporting across Skills, Agents, MCPs, and Plugins
https://www.bestblogs.dev/status/2057584283448205353?utm_source=rss&utm_medium=feed&utm_campaign=resources&entry=rss_article_item
Core insight: This is the first production-grade token metering tool designed explicitly for multi-agent architectures—not single API calls. It transforms abstract capabilities like Skills into auditable, attributable, and optimizable cost units—directly enabling enterprise-grade ROI calculation and budget governance for agent deployments.
— Potential applications: Individual developers can run `/usage` in their local Claude Code environment to inspect token distribution across Skills in the current session; product teams can use this data to design SaaS pricing models, e.g., packaging the 'code review Skill' as a standalone paid plugin and generating customer invoices from `/usage` output.
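The cost attribution described in this item can be sketched as a small aggregation. This is a minimal sketch under stated assumptions, not the actual `/usage` output format: the record fields (`kind`, `name`, `tokens`), the sample values, and the price per million tokens are hypothetical placeholders chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical usage records; the real /usage report format may differ.
RECORDS = [
    {"kind": "Skill", "name": "code-review", "tokens": 120_000},
    {"kind": "Skill", "name": "doc-writer", "tokens": 30_000},
    {"kind": "Agent", "name": "planner", "tokens": 50_000},
    {"kind": "MCP", "name": "github", "tokens": 40_000},
    {"kind": "Plugin", "name": "linter", "tokens": 10_000},
]


def tokens_by_kind(records):
    """Sum token counts per capability type (Skill, Agent, MCP, Plugin)."""
    totals = defaultdict(int)
    for rec in records:
        totals[rec["kind"]] += rec["tokens"]
    return dict(totals)


def invoice_lines(records, price_per_million):
    """Convert per-item token counts into (name, cost) pairs for a simple invoice."""
    return [
        (rec["name"], rec["tokens"] / 1_000_000 * price_per_million)
        for rec in records
    ]


print(tokens_by_kind(RECORDS))
for name, cost in invoice_lines(RECORDS, price_per_million=3.0):
    print(f"{name}: {cost:.4f}")
```

Grouping by capability type gives the budget view, while the per-item lines are what a per-Skill pricing model would bill against.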
DeepSeek V4-Pro API permanently reduced to 25% of original price; launches 'Harness' engineering initiative to match Claude Code
https://www.bestblogs.dev/article/5e68673c?utm_source=rss&utm_medium=feed&utm_campaign=resources&entry=rss_article_item
Core insight: This price cut isn't a promotional tactic—it's a strategic lever to force deep engineering re-architecture. Optimized specifically for agent workloads, V4-Pro—paired with the Harness initiative—aims to build a Chinese-native Claude Code toolchain, shifting the battleground for programming agents from raw model parameters to system-level usability.
— Potential applications: Developers can instantly compare V4-Pro against original V4 via curl, measuring latency and token usage under /goal-driven multi-task scenarios; product teams can rapidly deploy domain-specific CLI agents (e.g., legal contract review) leveraging V4-Pro's low-cost, high-concurrency strengths—and monitor progress in real time using /side.
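The latency and token comparison suggested above could look roughly like the following harness. It is a sketch only: `fake_model` stands in for a real HTTP call, the token-counting rule is invented, and the original price of 8.0 per million tokens is a hypothetical figure used to show the 25% arithmetic.

```python
import time
from statistics import mean


def benchmark(call, prompts):
    """Time each call and collect the token count it reports.

    `call` is any function mapping a prompt to a token count, for example
    a wrapper around an API request (hypothetical here).
    """
    latencies, tokens = [], []
    for prompt in prompts:
        start = time.perf_counter()
        used = call(prompt)
        latencies.append(time.perf_counter() - start)
        tokens.append(used)
    return {"mean_latency_s": mean(latencies), "total_tokens": sum(tokens)}


def cost(total_tokens, price_per_million):
    """Cost of a workload at a given price per million tokens."""
    return total_tokens / 1_000_000 * price_per_million


def fake_model(prompt):
    # Stand-in for a real model call: pretends each word costs 2 tokens.
    return 2 * len(prompt.split())


ORIGINAL_PRICE = 8.0                   # hypothetical price per million tokens
REDUCED_PRICE = ORIGINAL_PRICE * 0.25  # "25% of its original price"

result = benchmark(fake_model, ["review this contract", "summarize clause nine"])
print(result["total_tokens"], cost(result["total_tokens"], REDUCED_PRICE))
```

Swapping `fake_model` for two real client wrappers would let the same harness compare models on identical prompts.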
Baibu Intelligence releases ForgeTrain: the world's first production-grade pretraining framework authored by AI, used to train MiniCPM5-1B
https://www.bestblogs.dev/article/1ac2cf11?utm_source=rss&utm_medium=feed&utm_campaign=resources&entry=rss_article_item
Core insight: AI is no longer just the output of training—it's becoming the builder of infrastructure. ForgeTrain automatically generates distributed training logic, gradient synchronization strategies, and fault-recovery mechanisms—achieving 1B-model training efficiency surpassing Megatron, thereby validating the feasibility of 'AI infrastructure bootstrapping'.
— Potential applications: Individual developers can clone the ForgeTrain repo and reproduce MiniCPM5-1B's lightweight training on Colab using a single A10G GPU; product teams can package ForgeTrain as a 'Model Factory' SaaS—enabling customers to upload proprietary data and generate custom small models with one click.
Huawei formally introduces the 'Tao Law', redefining chip evolution around time constant τ
https://www.bestblogs.dev/article/083617dd?utm_source=rss&utm_medium=feed&utm_campaign=resources&entry=rss_article_item
Core insight: Abandoning the transistor-scaling path, Huawei instead optimizes the four-dimensional trade-off among performance, power, area, and latency via logic folding, 3D stacking, and heterogeneous integration. τ emerges as a measurable, comparable new benchmark for chip competitiveness, offering China a non-size-dependent breakthrough path toward sovereign compute.
— Potential applications: Hardware engineers can use Huawei's Ascend SDK to measure τ (e.g., LLM inference latency per watt) across chips and plot Pareto frontiers of 'τ vs. accuracy'; product teams can redefine device specs around τ, e.g., marketing next-gen flagship phones with 'LLM response τ < 800ms on-device' as a core selling point.
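The 'τ vs. accuracy' Pareto frontier mentioned above can be computed with a few lines once measurements exist. The sketch below assumes lower τ and higher accuracy are both better; the chip names and numbers are made up for illustration and do not come from the Ascend SDK or any real measurement.

```python
def pareto_frontier(points):
    """Return the points that no other point dominates.

    Each point is (name, tau, accuracy). A point is dominated when another
    point is at least as good on both axes and strictly better on one.
    """
    frontier = []
    for name, tau, acc in points:
        dominated = any(
            (t <= tau and a >= acc) and (t < tau or a > acc)
            for _, t, a in points
        )
        if not dominated:
            frontier.append((name, tau, acc))
    # Sort by tau so the frontier can be plotted left to right.
    return sorted(frontier, key=lambda p: p[1])


# Hypothetical measurements: (chip, tau, accuracy).
CHIPS = [
    ("chip-a", 0.8, 0.90),
    ("chip-b", 1.2, 0.92),
    ("chip-c", 1.0, 0.88),  # worse than chip-a on both axes
    ("chip-d", 0.6, 0.85),
]

for name, tau, acc in pareto_frontier(CHIPS):
    print(name, tau, acc)
```

The quadratic scan is fine for a handful of chips; a sort-based sweep would be the usual choice for large sets.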
Anthropic open-sources Cybersecurity Skills: a knowledge base spanning 26 domains and 754 structured security skills
https://www.bestblogs.dev/status/2058414217162895622?utm_source=rss&utm_medium=feed&utm_campaign=resources&entry=rss_article_item
Core insight: It converts fragmented security expertise (e.g., OWASP Top 10) into atomic, agent-callable skills (e.g., scan_spring_boot_actuator)—achieving, for the first time, 'orchestratable, verifiable, and auditable' cybersecurity capabilities, breaking down traditional security toolchain silos.
— Potential applications: Security engineers can import this knowledge base into local Claude Code and trigger automated skill orchestration with /goal "perform full-stack penetration testing on a Spring Boot app"; product teams can build 'Red Team Agent-as-a-Service', where customers submit an application URL and receive an automated penetration report—including PoC validation.
OPPO ColorOS 16 delivers 'Invisible AI': deep system integration of lightweight intent-recognition models + agent workflows + RAG knowledge bases
https://www.bestblogs.dev/article/c5f3c6a3?utm_source=rss&utm_medium=feed&utm_campaign=resources&entry=rss_article_item
Core insight: Mobile AI fully abandons pop-ups, dedicated entry points, and explicit commands—instead operating silently at the OS layer to sense user context (e.g., calendar meetings, email attachments, clipboard content), then triggering lightweight models to dispatch agents for RAG retrieval and workflow execution—achieving true 'existence-as-a-service'.
— Potential applications: App developers can integrate the ColorOS AI SDK and register event hooks like onMeetingStart to auto-trigger local agents that summarize meetings and push notes to Feishu; product teams can replicate this pattern—for e-commerce apps, enabling seamless 'browse product page → auto-price-compare → generate decision summary' shopping assistance.
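The event-hook pattern described above can be illustrated with a generic registry. This is not the ColorOS AI SDK, whose actual API is not shown in the source; `HookRegistry`, `on`, and `emit` are hypothetical names, and only the event name `onMeetingStart` comes from the text.

```python
from collections import defaultdict


class HookRegistry:
    """Minimal event-hook registry (hypothetical, for illustration only)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event):
        """Decorator that registers a handler for the named event."""
        def decorator(fn):
            self._handlers[event].append(fn)
            return fn
        return decorator

    def emit(self, event, **context):
        """Call every handler registered for the event and collect the results."""
        return [fn(**context) for fn in self._handlers[event]]


hooks = HookRegistry()


@hooks.on("onMeetingStart")
def summarize_meeting(title, attendees):
    # A real handler would dispatch a local agent; this one only reports intent.
    return f"Summary requested for '{title}' with {len(attendees)} attendees"


print(hooks.emit("onMeetingStart", title="Sprint review", attendees=["a", "b"]))
```

The system layer would call `emit` when it detects the context change, so the app never needs an explicit user command.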
Xiaomi MiMo-V2.5 API cuts prices up to 99%, anchoring at ¥0.025 per million tokens
https://www.bestblogs.dev/article/6c061586?utm_source=rss&utm_medium=feed&utm_campaign=resources&entry=rss_article_item
Core insight: This isn't price competition—it's engineering extremism. By combining OSCAR 2-bit KV cache quantization, high-frequency cache hits, and mixed-precision inference, Xiaomi pushes inference cost to its physical lower bound—forcing every agent product to rebuild its cost model from the ground up.
— Potential applications: Startups can rapidly prototype MVPs using Xiaomi's API and empirically test whether 'processing one customer support ticket' costs less than 1/10 the labor cost; product teams can implement 'token prepayment + overage circuit-breaking'—automatically downgrading to rule-based engines when per-dialogue token usage exceeds thresholds, ensuring SLA compliance.
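The 'overage circuit-breaking' idea above can be sketched as a per-dialogue token budget that falls back to a rule-based reply once the threshold is exceeded. The class name, the threshold, and the stand-in `fake_llm` and `rule_engine` functions are all assumptions made for this example.

```python
class TokenBudgetBreaker:
    """Tracks tokens used in one dialogue and trips once the budget is exceeded."""

    def __init__(self, max_tokens_per_dialogue):
        self.max_tokens = max_tokens_per_dialogue
        self.used = 0

    def record(self, tokens):
        self.used += tokens

    @property
    def tripped(self):
        return self.used > self.max_tokens


def answer(message, breaker, llm, rules):
    """Use the model while the budget holds; otherwise downgrade to rules."""
    if breaker.tripped:
        return rules(message)
    reply, tokens = llm(message)
    breaker.record(tokens)
    return reply


def fake_llm(message):
    # Stand-in for a model call: returns a reply and a fixed token count.
    return f"llm: {message}", 60


def rule_engine(message):
    # Stand-in for a rule-based fallback.
    return "rules: please contact support"


breaker = TokenBudgetBreaker(max_tokens_per_dialogue=100)
for turn in ["hello", "where is my order", "still waiting"]:
    print(answer(turn, breaker, fake_llm, rule_engine))
```

In this version the breaker is checked before each call, so the turn that crosses the threshold still completes and the downgrade applies from the next turn onward.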
Alook open-sources a CLI Agent collaboration platform: supports role assignment, email-style communication, and shared memory for AI team orchestration
https://www.bestblogs.dev/status/2059729329119006928?utm_source=rss&utm_medium=feed&utm_campaign=resources&entry=rss_article_item
Core insight: Elevates monolithic agents like Claude Code into manageable organizations. Using email protocols to simulate human collaboration (e.g., reviewer@ai automatically executing code reviews upon receiving a PR), it achieves, for the first time, 'AI team autonomy' at the CLI layer.
— Potential applications: Developers can clone the Alook repo, configure two roles (claude@ai, tester@ai), and launch cross-role collaboration with alook run --task "fix login page XSS vulnerability"; product teams can embed it into DevOps pipelines—triggering GitHub Actions to automatically dispatch tasks to AI teams for code scanning, testing, and documentation updates.
Codex adds /side command for side-panel conversations, enabling real-time monitoring of long-running /goal tasks without interrupting the main session
https://www.bestblogs.dev/status/2058612576775229669?utm_source=rss&utm_medium=feed&utm_campaign=resources&entry=rss_article_item
Core insight: Solves the fundamental observability gap in multi-agent workflows—/side creates an isolated contextual channel, allowing users to debug primary logic while simultaneously viewing live logs and resource consumption of background tasks (e.g., 'deploy to staging'), dramatically improving control over complex development flows.
— Potential applications: Programmers in VS Code can use the Codex extension to type /side /goal deploy-staging, opening a persistent sidebar showing live deployment progress; product teams can build an 'AI Engineer Dashboard' aggregating all /side outputs from /goal tasks—creating a unified, team-level view of agent operations.
Qwen3.7 Max ranks #2 globally on Vibe Coding, outperforming Claude and Gemini in real-world coding experience scenarios
https://www.bestblogs.dev/article/392bb55d?utm_source=rss&utm_medium=feed&utm_campaign=resources&entry=rss_article_item
Core insight: Vibe Coding evaluates developers' subjective experience, including intent-understanding accuracy, context-retention duration, and the naturalness of error recovery. Qwen3.7 Max's #2 ranking is read here as evidence that Chinese models have established a generational advantage in human-AI collaborative intuition.
— Potential applications: Frontend developers can use Qwen
