Author: RadarAI Editorial
Editor: RadarAI Editorial
Last updated: 2026-05-23
Review status: Editorial review pending
Weekly report
Official
AI Hot Topics
Anthropic surpasses OpenAI with a $900B valuation and achieves profitability two years ahead of schedule, marking the first time a major LLM company has entered public-market valuation validation.
Editorial standards and source policy: Editorial standards, Team. Content links to primary sources; see Methodology.
## This Week in Summary
- Anthropic has surpassed OpenAI in valuation, reaching $900 billion, and achieved profitability two years ahead of schedule, marking the formal entry of large-model companies into secondary-market value validation.
- At Google I/O 2026, Google fully pivoted to an *agent-native* paradigm. Four foundational pillars launched simultaneously: **Gemini Omni** (a world model), **Gemini 3.5 Flash** (87 ms on-device inference), **Antigravity 2.0** (a visual orchestration platform), and **Gemini Spark** (a 24/7 personal intelligent agent). Together they define a new standard for system-level agent infrastructure.
- Tencent launched **Marvis**, the first personal AI scheduler deeply integrated with the OS kernel, enabling natural-language control over file search, system configuration, and cross-device operations. AI is no longer just a “conversational interface” but now functions as a true “task orchestrator.”
- **Ring-2.6-1T**, a trillion-parameter open-source model, was released with a focus on agent execution, multi-tiered inference, and asynchronous reinforcement learning. China’s AI industry is shifting from “bigger parameters” toward “stronger reasoning + real-world execution” to tackle genuinely complex tasks.
- Two evaluation frameworks are now converging: **Token Economics** (championed by Jensen Huang) and **DAA (Daily Active Agents)** (proposed by Robin Li). This marks a critical upgrade in industry metrics: Token Economics anchors cost to compute investment, while DAA measures actual, sustained agent utility. Together, they form a complementary health dashboard for the AI ecosystem.
- Elon Musk, OpenAI, and Anthropic all reached the same conclusion: *Not building your own coding agent means forfeiting high-quality process supervision data, and with it the core engine for continuous model evolution.*
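
The two metrics named in the list above can be made concrete with a small calculation. The sketch below rests on definitions the source does not spell out: DAA is taken to be the number of distinct agents that completed at least one task on a given day, and token cost is tokens consumed multiplied by an illustrative flat price. The record fields, agent names, and price are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AgentEvent:
    """One completed agent task (hypothetical log record)."""
    agent_id: str
    day: date
    tokens_used: int


def daily_active_agents(events: list[AgentEvent]) -> dict[date, int]:
    """Count distinct agents with at least one completed task per day."""
    agents_by_day: dict[date, set[str]] = defaultdict(set)
    for event in events:
        agents_by_day[event.day].add(event.agent_id)
    return {day: len(agents) for day, agents in agents_by_day.items()}


def token_cost(events: list[AgentEvent], usd_per_million_tokens: float) -> float:
    """Total spend implied by token usage at an assumed flat price."""
    total_tokens = sum(event.tokens_used for event in events)
    return total_tokens / 1_000_000 * usd_per_million_tokens


events = [
    AgentEvent("agent-a", date(2026, 5, 18), 120_000),
    AgentEvent("agent-a", date(2026, 5, 18), 80_000),
    AgentEvent("agent-b", date(2026, 5, 18), 300_000),
    AgentEvent("agent-a", date(2026, 5, 19), 500_000),
]
daa = daily_active_agents(events)
cost = token_cost(events, usd_per_million_tokens=2.0)
print(daa, cost)
```

Reading the two numbers side by side is the point: rising token spend with flat DAA would suggest compute is growing faster than sustained agent usage.
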
## Hot Topics
1. **Gemini Omni Launch**: The first end-to-end trained multimodal world model
https://www.bestblogs.dev/article/1d51b31d?utm_source=rss&utm_medium=feed&utm_campaign=resources&entry=rss_article_item
**Core idea**: For the first time, it jointly models physical, social, and digital spaces and performs causal reasoning across them. It ingests real-time data streams from Google Search and Maps, signaling a shift from “perception + generation” to “world understanding + proactive intervention.” Its impact is infrastructural rather than incremental: it rewrites the logic underpinning how information systems operate.
**Try it**: Developers can try replacing an existing RAG pipeline with the Gemini Omni API to build local agents that reason across space and time (e.g., *“Analyze foot traffic changes in Tongzhou, Beijing over the past 3 months → correlate with Line 17 subway construction progress → forecast offline conversion rates for the 618 shopping festival”*).
**Validation**: Use the `gemini-omni` model to call both the `search` and `maps` tools in one chain, perform cross-modal causal attribution, and record the end-to-end latency and the attribution confidence score.
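
The validation step above asks for end-to-end latency. A minimal timing harness might look like the following. `run_chain` is a hypothetical stand-in for whatever client call invokes the model with both tools, since the source gives no API details, and `fake_chain` only simulates a slow call.

```python
import statistics
import time
from typing import Callable


def measure_latency(
    run_chain: Callable[[str], object], prompt: str, runs: int = 5
) -> dict[str, float]:
    """Time repeated end-to-end calls and summarize the latencies in seconds."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        run_chain(prompt)  # hypothetical: model call that uses search + maps tools
        latencies.append(time.perf_counter() - start)
    return {
        "min": min(latencies),
        "median": statistics.median(latencies),
        "max": max(latencies),
    }


def fake_chain(prompt: str) -> str:
    """Placeholder that simulates a slow tool-calling chain."""
    time.sleep(0.01)
    return f"answer for: {prompt}"


summary = measure_latency(fake_chain, "foot traffic vs. subway construction", runs=3)
print(summary)
```

Reporting the median alongside the minimum and maximum keeps one slow tool call from dominating the result.
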
2. **Tencent Marvis Launch**: The first personal AI scheduler deeply integrated with the OS kernel
https://www.bestblogs.dev/article/9aef4fe3?utm_source=rss&utm_medium=feed&utm_campaign=resources&entry=rss_article_item
**Core idea**: Six built-in agents connect directly to OS kernel interfaces, with no app permissions or context switching required, to execute file search, system settings changes, and cross-device control. This delivers true “speak-and-execute” functionality and ushers in OS-level AI orchestration in place of siloed, app-by-app interaction.
**Try it**: Windows/macOS developers can prototype a lightweight kernel proxy layer using Tencent Cloud’s public architecture docs (on GitHub). Write a minimal viable kernel module in Rust, e.g., `marvis-syscall-proxy`, that listens on a device node such as `/dev/marvis` (a Unix-style path; Windows would need an equivalent device interface), then maps voice commands like *“Close all Chrome tabs”* to system commands such as `killall chrome`. Note that `killall` is a shell command rather than a syscall, and it terminates the whole browser process, not just its tabs.
**Validation**: Load the module on macOS via `kextload` (kernel extensions are deprecated on recent macOS, so loading may require relaxed security settings) and complete three zero-context-switch command loops.
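
The command-mapping idea can be prototyped in user space before touching the kernel. This sketch only resolves a recognized phrase to an argument list and executes nothing. The phrase table is hypothetical, and the process name `Google Chrome` is an assumption about a macOS target.

```python
import shlex
from typing import Optional

# Hypothetical phrase-to-command table. Each entry is something the agent
# would be allowed to run, so the table doubles as an allowlist.
COMMAND_TABLE: dict[str, list[str]] = {
    "close chrome": ["killall", "Google Chrome"],
    "show disk usage": ["df", "-h"],
}


def normalize(phrase: str) -> str:
    """Lowercase and collapse whitespace so lookups are forgiving."""
    return " ".join(phrase.lower().split())


def resolve_command(phrase: str) -> Optional[list[str]]:
    """Return the argv list for a known phrase, or None if unrecognized."""
    return COMMAND_TABLE.get(normalize(phrase))


argv = resolve_command("  Close   Chrome ")
printable = shlex.join(argv) if argv else "unrecognized"
print(printable)
```

Keeping commands as argument lists rather than shell strings avoids quoting bugs and makes it harder for a transcribed phrase to inject extra shell syntax.
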
3. **Ring-2.6-1T Officially Open-Sourced**: A trillion-parameter reasoning model built for real-world complexity
https://www.bestblogs.dev/article/2e577b36?utm_source=rss&utm_medium=feed&utm_campaign=resources&entry=rss_article_item
**Core idea**: Ant Group’s Bailing team released Ring-2.6-1T, the first open-source model to integrate an agent execution engine, multi-level reasoning intensity control, and an asynchronous reinforcement learning framework. It is designed for long-horizon, multi-step, cross-tool business tasks, such as diagnosing supply chain anomalies, auto-triggering restocking, and renegotiating contracts, and fills a critical gap in domestic models’ *closed-loop execution* capability.
**Try it**: Enterprise developers can download the 8-bit quantized version of Ring-2.6-1T and deploy it locally, then connect it to internal ERP/CRM APIs to build a “Sales Receivables Anomaly Root-Cause Analysis Agent”: input an overdue order ID → automatically check payment terms, compare logistics tracking, pull customer support tickets → output a root-cause report and trigger collections workflows.
**Validation**: Test on 10 real overdue cases. Measure the task completion rate and the agreement with human expert review, with a target of ≥85%.
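
The two validation numbers above can be computed from a simple per-case record. This is a sketch under assumptions: the record fields are hypothetical, agreement is measured as an exact match of root-cause labels on completed cases only, and the sample data is invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class CaseResult:
    """Outcome of one overdue-order case (hypothetical record)."""
    order_id: str
    completed: bool  # agent finished every step without manual help
    agent_root_cause: str
    expert_root_cause: str


def completion_rate(results: list[CaseResult]) -> float:
    """Share of cases the agent ran to completion."""
    return sum(r.completed for r in results) / len(results)


def expert_agreement(results: list[CaseResult]) -> float:
    """Share of completed cases where the agent matches the expert label."""
    completed = [r for r in results if r.completed]
    if not completed:
        return 0.0
    matches = sum(r.agent_root_cause == r.expert_root_cause for r in completed)
    return matches / len(completed)


# Invented sample: 8 matching cases, 1 mismatch, 1 incomplete run.
results = [
    CaseResult(f"SO-{i}", True, "late shipment", "late shipment") for i in range(8)
]
results.append(CaseResult("SO-8", True, "pricing dispute", "late shipment"))
results.append(CaseResult("SO-9", False, "", "credit hold"))

meets_target = expert_agreement(results) >= 0.85
print(completion_rate(results), expert_agreement(results), meets_target)
```

With only 10 cases, one disagreement moves the agreement figure by roughly ten points, so the 85% threshold is best read as a rough gate rather than a precise measurement.
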
4. **Anthropic Valued at $900B**: Surpassing OpenAI
https://www.bestblogs.dev/article/742a688a?utm_source=rss&utm_medium=feed&utm_campaign=resources&entry=rss_article_item
**Core idea**: This valuation leap reflects a market-wide re-pricing of “AI-native moats”: domain expertise, user-data flywheels, and workflow lock-in have overtaken raw parameter count as the primary value drivers. Anthropic’s early profitability further suggests that vertically integrated, closed-loop AI applications have strong monetization potential.
**Try it**: Founders should stop building generic features and instead use Anthropic’s Claude Code Harness framework to build a *minimal closed loop* for one narrow use case (e.g., law firm contract review). Constrain inputs (PDF contracts only), standardize outputs (list of risky clauses + revision suggestions), and preload three private knowledge sources (past court rulings, firm SOPs, latest judicial interpretations).
**Validation**: Invite 5 law firms to trial it for one week. Track the average time saved per contract and the percentage of AI-suggested edits adopted.
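
The “constrain inputs, standardize outputs” advice above can be expressed as a small schema around whatever model call produces the findings. This is a sketch, not part of any Anthropic framework: the class names, field names, and sample finding are hypothetical.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class RiskyClause:
    clause_number: str
    risk: str
    suggested_revision: str


@dataclass
class ReviewReport:
    contract_file: str
    clauses: list[RiskyClause] = field(default_factory=list)


def validate_input(path: str) -> Path:
    """Reject anything that is not a PDF, per the PDF-only constraint."""
    contract = Path(path)
    if contract.suffix.lower() != ".pdf":
        raise ValueError(f"only PDF contracts are accepted, got: {contract.name}")
    return contract


def build_report(path: str, findings: list[dict[str, str]]) -> ReviewReport:
    """Coerce raw model findings into the fixed report shape."""
    contract = validate_input(path)
    clauses = [
        RiskyClause(f["clause_number"], f["risk"], f["suggested_revision"])
        for f in findings
    ]
    return ReviewReport(contract_file=contract.name, clauses=clauses)


report = build_report(
    "msa_2026.pdf",
    [{
        "clause_number": "7.2",
        "risk": "unlimited liability",
        "suggested_revision": "cap liability at fees paid",
    }],
)
print(report)
```

A fixed output shape also makes the validation metric easy to collect, because each `RiskyClause` can later be marked as adopted or rejected by the reviewing lawyer.
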
5. **Codex Integrated into ChatGPT Mobile App**: Now supports remote monitoring and approval
https://www.bestblogs.dev/status/2055042674365976587?utm_source=rss&utm_medium=feed&utm_campaign=resources&entry=rss_article_item
**Core idea**: Codex has moved code execution from cloud sandboxes to users’ local devices. Via mobile, you can now pause a run, inspect diffs, or approve permissions in real time, which delivers the first truly *human-in-the-loop*, lightweight, and secure mobile coding collaboration experience and shifts control back to developers.
**Try it**: Frontend engineers can enable Codex directly in the iOS ChatGPT app and issue a natural-language command like: *“Fix all TypeScript type-check errors in this GitHub PR and generate unit tests.”* Observe whether it runs `tsc --noEmit` and `jest --coverage` locally and returns a clean diff patch.
**Validation**: Compare Codex on mobile vs. the VS Code extension on the same PR: measure fix accuracy and the number of required approval steps.
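
One rough way to quantify fix accuracy for the comparison above is to count compiler errors in saved `tsc --noEmit` output before and after the agent's patch. The sketch assumes that counting `error TS<code>:` lines is an acceptable proxy; it does not detect new errors that a patch introduces elsewhere, and the sample output is invented.

```python
import re

# Matches the error code in both plain ("file.ts(3,7): error TS2322: ...")
# and pretty ("file.ts:3:7 - error TS2322: ...") tsc output.
TSC_ERROR = re.compile(r"\berror TS(\d+):")


def count_tsc_errors(tsc_output: str) -> int:
    """Number of compiler errors reported in captured tsc output."""
    return len(TSC_ERROR.findall(tsc_output))


def fix_rate(before: str, after: str) -> float:
    """Fraction of the original error count that is gone after the patch."""
    errors_before = count_tsc_errors(before)
    if errors_before == 0:
        return 1.0
    errors_after = count_tsc_errors(after)
    return max(0.0, (errors_before - errors_after) / errors_before)


before = "\n".join([
    "src/app.ts(3,7): error TS2322: Type 'string' is not assignable to type 'number'.",
    "src/app.ts(9,1): error TS2304: Cannot find name 'foo'.",
    "src/util.ts(2,14): error TS7006: Parameter 'x' implicitly has an 'any' type.",
    "src/util.ts(5,3): error TS2345: Argument of type 'number' is not assignable.",
])
after = "src/util.ts(5,3): error TS2345: Argument of type 'number' is not assignable."

rate = fix_rate(before, after)
print(rate)
```

Running the same calculation on the mobile and VS Code runs gives a like-for-like number to set beside the count of approval steps.
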
6. **Kimi Web Bridge Launches**: Enabling agents to operate browsers like humans
https://www.bestblogs.dev/article/2e577b36?utm_source=rss&utm_medium=feed&utm_campaign=resources&entry=rss_article_item
**Core idea**: The first browser-level agent infrastructure open to the entire ecosystem, offering real DOM access, JavaScript execution, form filling, and event simulation. It breaks through UI-layer interaction bottlenecks, allowing agents to natively complete full tasks that require visual context, e.g., “log into online banking → check balance → export CSV.”
**Try it**: An independent developer could use the Web Bridge SDK to quickly build a “Social Security & Housing Fund Annual Audit Assistant”: input an ID number → automatically navigate to HR bureaus’ official websites across provinces → simulate clicking “Annual Declaration” → enter the CAPTCHA → download the PDF receipt.
**Validation**: Run end-to-end automation on three provincial HR bureau sites (e.g., Guangdong, Zhejiang, Sichuan); record the success rate and average runtime.
7. **Anthropic Launches Claude AI Assistant for Small Businesses**
https://www.bestblogs.dev/article/910b0b6b?utm_source=rss&utm_medium=feed&utm_campaign=resources&entry=rss_article_item
**Core idea**: Focused on automating document processing (contracts, invoices, reports) and customer communication (email, WhatsApp, WeChat), it delivers pre-built workflow templates and a low-code configuration interface, lowering the AI adoption barrier so SMB owners can set things up themselves. It directly addresses the long-tail market’s “need without tech capacity” pain point.
**Try it**: A sole proprietor signs up for Claude Business, uploads five past procurement contracts, and enables the “auto-extract payment terms + WeChat reminder 7 days before due date” feature in the “Contract Management” template.
**Validation**: Set up three mock contracts with due dates on next Wednesday, the 5th of next month, and the last day of next quarter; verify whether reminders are delivered on time to the designated WeChat account.
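
To check reminder timing, it helps to compute the three due dates and their expected reminder dates up front. The sketch below assumes “next Wednesday” means the first Wednesday strictly after today and that quarters follow the calendar year; the function names are illustrative.

```python
import calendar
from datetime import date, timedelta


def next_wednesday(today: date) -> date:
    """First Wednesday strictly after `today` (Monday is weekday 0)."""
    days_ahead = (2 - today.weekday() - 1) % 7 + 1
    return today + timedelta(days=days_ahead)


def fifth_of_next_month(today: date) -> date:
    """The 5th day of the month after the one containing `today`."""
    if today.month == 12:
        return date(today.year + 1, 1, 5)
    return date(today.year, today.month + 1, 5)


def last_day_of_next_quarter(today: date) -> date:
    """Last calendar day of the quarter after the one containing `today`."""
    current_quarter = (today.month - 1) // 3  # 0 for Q1 .. 3 for Q4
    next_quarter = (current_quarter + 1) % 4
    year = today.year + (1 if current_quarter == 3 else 0)
    end_month = next_quarter * 3 + 3
    return date(year, end_month, calendar.monthrange(year, end_month)[1])


def reminder_date(due: date, days_before: int = 7) -> date:
    """Date on which the reminder is expected to fire."""
    return due - timedelta(days=days_before)


today = date(2026, 5, 23)
due_dates = [
    next_wednesday(today),
    fifth_of_next_month(today),
    last_day_of_next_quarter(today),
]
for due in due_dates:
    print(due.isoformat(), "reminder expected on", reminder_date(due).isoformat())
```

Because next Wednesday is at most seven days away, its reminder date always falls on or before today, so that mock contract tests how the feature behaves when the reminder window has already opened.
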
8. **WeRead Introduces Agent Skill**: AI agents can now read, parse, and reason over e-books
https://www.bestblogs.dev/status/2055865535804629132?utm_source=rss&utm_medium=feed&utm_campaign=resources&entry=rss_article_item
**Core idea**: The first content platform to deeply embed AI agents into the reading loop, supporting book-wide semantic Q&A, blind-spot analysis, and mental model extraction. This shifts content services from “information delivery” to “cognitive enhancement,” with user data fueling personalized cognitive map building.
**Try it**: Educators can combine WeRead’s API with Agent Skill to build a “Chapter-Level Socratic Training Pack” for *Sapiens*: after reading each chapter, the agent automatically serves three Socratic questions (e.g., “Harari calls the Agricultural Revolution ‘the greatest fraud in history.’ Using evidence from the book, argue against this claim.”).
**Validation**: Invite 20 university students to use this feature for three chapters; compare their final exam scores on critical-thinking questions against a control group using traditional note-taking.
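
For the score comparison above, a difference in means plus Welch's t statistic gives a first read on whether the gap is larger than the noise. This sketch uses only the standard library; the scores are invented, and Welch's test is one reasonable choice here rather than a method the source prescribes.

```python
import math
import statistics


def mean_difference(treatment: list[float], control: list[float]) -> float:
    """Average score gap between the two groups."""
    return statistics.mean(treatment) - statistics.mean(control)


def welch_t(treatment: list[float], control: list[float]) -> float:
    """Welch's t statistic for two independent samples with unequal variances."""
    standard_error = math.sqrt(
        statistics.variance(treatment) / len(treatment)
        + statistics.variance(control) / len(control)
    )
    return mean_difference(treatment, control) / standard_error


# Invented critical-thinking scores for illustration only.
agent_group = [78.0, 85.0, 90.0, 72.0, 88.0]
notes_group = [70.0, 75.0, 80.0, 68.0, 77.0]

gap = mean_difference(agent_group, notes_group)
t_stat = welch_t(agent_group, notes_group)
print(gap, t_stat)
```

With about 20 students split across two groups, only a large gap will stand out, so the t statistic is better treated as a sanity check than as proof of an effect.
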
9. **Vercel Labs Releases Zero**: A programming language designed specifically for AI agents
https://www.bestblogs.dev/article/1d51b31d?utm_source=rss&utm_medium=feed&utm_campaign=resources&entry=rss_article_item
**Core idea**: Designed to be smaller, faster, and easier to debug, with syntax purpose-built for agent state management, tool-calling pipelines, and error recovery. It targets foundational inefficiencies in agent development: hard debugging, state drift, and irreversible failures. Zero is the first programming language truly designed for the agentic paradigm.
**Try it**: An agent engineer downloads the Zero compiler and rewrites an existing Python-based agent (e.g., a weather bot), focusing on upgrading its “API failure → auto-switch to fallback source → serve cached degraded response” logic.
**Validation**: Inject network instability (e.g., `tc qdisc add dev lo root netem delay 5000ms loss 30%`, run as root on Linux) and compare resilience metrics between the Python and Zero versions.
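
As a Python baseline for that comparison, the “API failure → fallback source → cached degraded response” logic can be sketched as below. The source functions, cache layout, and `stale` flag are hypothetical, and a real agent would catch narrower exception types.

```python
from typing import Callable, Optional


class DegradedResponse(dict):
    """Marker type so callers can tell a cached answer from a live one."""


def fetch_with_fallback(
    sources: list[Callable[[str], dict]],
    cache: dict[str, dict],
    city: str,
) -> dict:
    """Try each source in order; if all fail, serve the cached value."""
    last_error: Optional[Exception] = None
    for source in sources:
        try:
            result = source(city)
            cache[city] = result  # refresh the cache on every success
            return result
        except Exception as error:  # broad on purpose for this sketch
            last_error = error
    if city in cache:
        return DegradedResponse(cache[city], stale=True)
    raise RuntimeError(f"no source or cache available for {city}") from last_error


def primary(city: str) -> dict:
    """Simulated primary API that always times out."""
    raise TimeoutError("primary source timed out")


def backup(city: str) -> dict:
    """Simulated fallback API that succeeds."""
    return {"city": city, "temp_c": 21}


cache: dict[str, dict] = {}
live = fetch_with_fallback([primary, backup], cache, "Beijing")
degraded = fetch_with_fallback([primary], cache, "Beijing")
print(live, degraded)
```

The first call falls through to the backup and warms the cache; the second has no working source and returns the cached value tagged as stale, which is the degraded path the resilience test should exercise.
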