Feb 17 AI Briefing · Issue #37
In this issue: Alibaba open-sources Qwen3.5-397B-A17B, a natively multimodal, sparse Mixture-of-Experts (MoE) LLM with 1M-token ultra-long context and 4-bit local inference on consumer-grade hardware; meanwhile, Manus Agents launches long-term memory and toolchain integration on Telegram, pushing AI assistants into the 'memorable and actionable' era.
Editorial standards and source policy: content links to primary sources; see Methodology.
## 🔍 Key Insights
Alibaba officially open-sourced **Qwen3.5-397B-A17B**—the world's first open-source LLM with a **natively multimodal, sparse MoE architecture**, supporting **1M-token ultra-long context** and **4-bit local inference on consumer-grade hardware**. Concurrently, **Manus Agents** rolled out long-term memory and integrated toolchains on Telegram—signaling a new era for AI assistants: 'memorable and actionable.'
## 🚀 Major Updates
- **Qwen3.5-397B-A17B officially open-sourced**: Alibaba released this 397B-parameter MoE model featuring native multimodal capabilities, empirically validated context length exceeding 262K tokens, and a hybrid linear attention architecture.
- **Qwen3.5 gains full-stack inference ecosystem support**: Immediately integrated with **vLLM, SGLang, Ollama Cloud, and NVIDIA NIM/NeMo**, enabling plug-and-play deployment and fine-tuning.
- **Unsloth AI enables consumer-grade local execution**: Delivers highly efficient 4-bit inference for Qwen3.5—enabling smooth operation of a multimodal model with hundreds of billions of parameters even on an RTX 4090.
- **Manus Agents launches on Telegram**: Supports **long-term memory**, integrations with **Gmail and Notion**, and multimodal content generation—the first lightweight, actionable AI assistant deployed in production.
- **Ant Group open-sources Ling-2.5-1T, a trillion-parameter model**: Features hybrid linear attention, supports **million-token long-text processing**, and delivers **millisecond-level real-time response**—broadening access to frontier-scale models.
- **OneVision-Encoder introduces codec-aligned sparsity**: A new paper establishes this principle as a foundational paradigm for multimodal intelligence—potentially unifying vision-language representation learning pathways.
- **GAPO algorithm tackles reward noise in RL for programming**: By leveraging highest-density-interval (HDI) estimation and median-based advantage modeling, GAPO significantly improves the training robustness of code LLMs in complex, real-world settings.
- **Qwen3.5 demonstrates strong agent engineering capability**: Generated a *Stardew Valley*-style game in a single file—and autonomously debugged code, fixed issues, and submitted a Pull Request—demonstrating a closed loop of end-to-end software engineering.
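The hybrid linear attention mentioned for Qwen3.5 and Ling-2.5-1T is what makes million-token contexts tractable: replacing softmax with a feature map lets attention be computed in time linear in sequence length. Below is a minimal, self-contained numpy sketch of that regrouping trick—the `elu + 1` feature map and all names are illustrative assumptions, not the actual architecture of either model.

```python
import numpy as np

# Illustrative sketch: with a feature map phi in place of softmax,
# (phi(Q) phi(K)^T) V can be regrouped as phi(Q) (phi(K)^T V),
# turning an O(n^2 d) computation into O(n d^2).

def phi(x):
    # A common positive feature map: elu(x) + 1
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    Qf, Kf = phi(Q), phi(K)                       # (n, d) each
    kv = Kf.T @ V                                 # (d, d): cost independent of n^2
    z = Qf @ Kf.sum(axis=0, keepdims=True).T      # (n, 1) normalizer
    return (Qf @ kv) / z

def quadratic_attention(Q, K, V):
    # Same math, grouped the O(n^2) way, for comparison
    A = phi(Q) @ phi(K).T                         # (n, n) attention matrix
    return (A @ V) / A.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n, d = 64, 8
Q, K, V = rng.normal(size=(3, n, d))
# Both groupings give the same result; only the linear one scales to long contexts.
assert np.allclose(linear_attention(Q, K, V), quadratic_attention(Q, K, V))
```

The key design point is associativity: the `(d, d)` state `kv` can also be accumulated incrementally, which is why linear-attention models stream over very long inputs with constant memory.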
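The 4-bit local inference highlighted above rests on weight quantization. Here is a toy numpy sketch of the general idea—blockwise symmetric 4-bit quantization—purely as an assumption-laden illustration, not Unsloth's actual kernels or data format.

```python
import numpy as np

# Toy sketch of blockwise 4-bit weight quantization: each block of 64
# weights is mapped to 16 signed integer levels (-8..7) plus one float
# scale per block, cutting memory roughly 8x versus fp32.

def quantize_4bit(w, block=64):
    w = w.reshape(-1, block)
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0   # per-block scale
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_4bit(q, scale):
    return q * scale

rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=4096).astype(np.float32)
q, s = quantize_4bit(w)
w_hat = dequantize_4bit(q, s).reshape(-1)

# Reconstruction error stays below half a quantization step per weight.
assert np.abs(w - w_hat).max() < 0.01
```

Real 4-bit schemes add refinements (non-uniform levels, double quantization of the scales), but the memory/accuracy trade-off shown here is the reason a model of this size can fit on a single consumer GPU at all.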
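GAPO's median-based advantage modeling can be illustrated with a small numbers example. This is not the paper's implementation (HDI estimation is omitted); it is a hedged sketch of why a median baseline resists the reward noise that plagues RL on code, where one flaky test harness can corrupt a reward.

```python
import numpy as np

# Toy sketch: group-relative advantages with a mean vs. a median baseline.
# The last reward is a noisy outlier (e.g. a crashed test runner).
rewards = np.array([0.8, 0.7, 0.9, 0.75, -10.0])

adv_mean = rewards - rewards.mean()        # baseline = -1.37, dragged down by the outlier
adv_median = rewards - np.median(rewards)  # baseline = 0.75, robust to the outlier

# Under the mean baseline, every clean sample looks ~2.1 "better than average"
# purely because the outlier wrecked the baseline; under the median baseline,
# clean advantages stay near zero and only the outlier is penalized.
assert np.all(adv_mean[:4] > 2.0)
assert np.all(np.abs(adv_median[:4]) < 0.2)
```

The practical upshot is gradient stability: with a robust baseline, a single corrupted reward stops inflating the advantages of unrelated samples in the same group.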