Feb 17 AI Briefing · Issue #39
## 🔍 Key Insights
**The Qwen 3.5 series** has energized the open-source large language model (LLM) ecosystem: its **397B parameters**, **native multimodality**, and **MoE + Linear Attention architecture** drew full-stack Day-One support from NVIDIA, AMD, Ollama, ZenMux, LMSYS, and mlx-vlm. Meanwhile, **LlamaIndex** is rapidly evolving into foundational AI Agent infrastructure: it has replaced its subscription model with the **$LLAMA token** and significantly upgraded its **PDF → Markdown/JSON parsing capability**, reinforcing the "cognitive infrastructure" that underpins intelligent agents.
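The Markdown/JSON duality of that parsing output can be illustrated with a small sketch. The parsing engine itself is not public, so the table data and both helper functions below are hypothetical; they show only the two structured output shapes, not the extraction step:

```python
import json

def table_to_markdown(header, rows):
    """Render an extracted table as a GitHub-style Markdown table."""
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)

def table_to_json(header, rows):
    """Render the same table as a list of JSON records."""
    return json.dumps([dict(zip(header, row)) for row in rows], indent=2)

# Hypothetical table pulled from a financial PDF:
header = ["Quarter", "Revenue"]
rows = [["Q1", "1.2B"], ["Q2", "1.4B"]]
print(table_to_markdown(header, rows))
print(table_to_json(header, rows))
```

The same extracted rows serialize either way, which is what lets downstream agents choose Markdown for LLM context or JSON for programmatic filtering.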
## 🚀 Top Highlights
- **Qwen 3.5-397B-A17B launches on LMSYS Arena**: Alibaba's new natively multimodal open model is now live on LMSYS Arena, supporting benchmarking across three modalities: text, vision, and code.
- **NVIDIA and AMD jointly announce Day-One support for Qwen 3.5**: Both companies offer free APIs (NVIDIA via NeMo, AMD via SGLang/vLLM) along with hardware acceleration on their respective GPU platforms, including AMD's Instinct accelerators.
- **Ollama Cloud and ZenMux roll out Qwen 3.5**: Ollama Cloud enables plug-and-play inference; ZenMux debuts **Qwen 3.5 Plus**, featuring a novel **Gated DeltaNet + Sparse MoE** architecture.
- **mlx-vlm v0.3.12 adds native macOS support**: For the first time, the Qwen 3.5 series now supports on-device visual-language model inference on Apple Silicon Macs.
- **LlamaCloud launches an enhanced PDF parsing engine**: Capable of high-fidelity extraction from complex PDFs—including tables and charts—with structured output in Markdown or JSON.
- **LlamaIndex adopts the $LLAMA token for unified API billing**: Moving away from monthly subscriptions, it introduces a universal token-based pricing model tailored for agent-driven API calls.
- **Google Antigravity releases a visual UI editor for building Agents**: Users screenshot a UI region and issue natural-language instructions, and the editor reconstructs the frontend interface in real time.
- **Fu Sheng publicly releases the OpenClaw open-source framework**: a demo generated personalized Lunar New Year greetings for **611 job-specific roles** in just four minutes, validating the "individual-as-a-team" productivity paradigm enabled by AI assistants.
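A minimal sketch of the plug-and-play inference mentioned in the Ollama Cloud item, using Ollama's standard `/api/chat` endpoint. The model tag `qwen3.5` and the local host URL are assumptions; the request shape follows Ollama's documented chat API:

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str, host: str = "http://localhost:11434"):
    """Build a non-streaming request for Ollama's /api/chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # return one complete response instead of chunks
    }
    req = urllib.request.Request(
        f"{host}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    return req, payload

# Usage (requires a running Ollama server with the model pulled):
# req, _ = build_chat_request("qwen3.5", "Summarize today's AI news in one line.")
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["message"]["content"])
```

Because the endpoint and payload shape are the same across Ollama-hosted models, switching models is a one-string change, which is the "plug-and-play" property the announcement highlights.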