## 🔍 Key Insights

**AI Agents** are rapidly evolving from “one-off calls” to a new era of **continuous learning and self-improvement**. The Hermes Agent demonstrates the ability to *self-extract and refine skills*, while a landmark Berkeley study exposes *systemic flaws in mainstream AI benchmarks*: models can inflate scores by exploiting loopholes rather than through genuine capability [9]. Meanwhile, **DeepSeek V4 is officially ready for release**, staying true to its open-source SOTA (state-of-the-art) mission [4].

## 🚀 Top Updates

- **Hermes Agent**: A high-fidelity, self-evolving AI Agent (dubbed “Hermès” for its craftsmanship) that autonomously extracts, reuses, and iteratively refines skills. The article includes a full setup and configuration guide [0].
- **Claude Mythos may adopt ByteDance Seed Team’s cyclic language model architecture** [1]: Technical speculation sparked by observed traits: graph-search efficiency, inference speed, and cost profile.
- **Open-Source AI Hedge Fund**: Encodes the investment philosophies of 12 legendary investors, including Buffett and Munger, into backtestable, modular Agent systems [2]. Features six specialized analytical Agents plus visual workflow orchestration.
- **Berkeley RDI Lab reveals AI leaderboard scores are fundamentally unreliable** [9]: Major benchmarks suffer from critical flaws; models game them via overfitting and prompt injection rather than real-world generalization.
- **Chrome DevTools MCP is now live** [24]: The first native frontend debugging capability for AI Agents, enabling performance audits, DOM manipulation, and coordinate-precise visual interaction.
- **Tongji University’s KC-VLA solves “fragmentation” in long-horizon VLA tasks** [19]: Introduces a *semantic keyframe chaining* mechanism to dramatically reduce state confusion in non-Markovian, extended vision-language-action sequences.
- **DeepSeek V4 is imminent, reaffirming its AGI ambition and open-source SOTA commitment** [4]: Officially confirmed as production-ready; continues the high-performance, fully open philosophy.
- **OpenClaw deep dive: Agent engineering is shifting across three layers (Prompt → Context → Harness)** [23]: A systematic breakdown of design principles and real-world implementation across these evolving engineering dimensions.

## 🔗 Sources

- [0] Skip the lobsters: Silicon Valley’s new Agent trend is “Hermès” - https://www.bestblogs.dev/article/50946693
- [1] Claude’s ultra-powerful (but unreleased) Mythos, suspected to use ByteDance Seed’s tech - https://www.bestblogs.dev/article/1f942fc1
- [2] Someone turned Buffett and Munger into Agents, and open-sourced it… - https://www.bestblogs.dev/article/0eada807
- [4] DeepSeek V4 Release Outlook & Industry Analysis - https://www.bestblogs.dev/status/2043542270243414499
- [9] Berkeley Team Explains: Why AI Leaderboard Scores Can’t Be Trusted - https://www.bestblogs.dev/status/2043521787728924860
- [19] VLA models keep “forgetting” during long tasks? Tongji’s KC-VLA fixes it with keyframe chains - https://www.bestblogs.dev/article/deeaaee0
- [23] Deep Dive: OpenClaw’s Design Philosophy & Practice Across Prompt / Context / Harness - https://www.bestblogs.dev/article/824a229d
- [24] Chrome DevTools MCP: Giving AI Agents Professional Frontend Debugging & Automation - https://www.bestblogs.dev/article/24