Feb 17 AI Briefing · Issue #39
## 🔍 Key Insights
**The Qwen 3.5 series** has energized the open-source large language model (LLM) ecosystem: its **397B parameters**, **native multimodality**, and **MoE + Linear Attention architecture** drew full-stack Day-One support from NVIDIA, AMD, Ollama, ZenMux, LMSYS, and mlx-vlm. Meanwhile, **LlamaIndex** is rapidly evolving into foundational AI Agent infrastructure: it has replaced its subscription model with the **$LLAMA token** and significantly upgraded its **PDF → Markdown/JSON parsing capability**, reinforcing the "cognitive infrastructure" that underpins intelligent agents.
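The Markdown/JSON duality of that parsing output can be illustrated with a small sketch. The parsing engine itself is not public, so the table data and both helper functions below are hypothetical; they show only the two structured output shapes, not the extraction step:

```python
import json

def table_to_markdown(header, rows):
    """Render an extracted table as a GitHub-style Markdown table."""
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)

def table_to_json(header, rows):
    """Render the same table as a list of JSON records."""
    return json.dumps([dict(zip(header, row)) for row in rows], indent=2)

# Hypothetical table pulled from a financial PDF:
header = ["Quarter", "Revenue"]
rows = [["Q1", "1.2B"], ["Q2", "1.4B"]]
print(table_to_markdown(header, rows))
print(table_to_json(header, rows))
```

The same extracted rows serialize either way, which is what lets downstream agents choose Markdown for LLM context or JSON for programmatic filtering.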
## 🚀 Top Highlights
- **Qwen 3.5-397B-A17B launches on LMSYS Arena**: Alibaba's new natively multimodal open model is now live on LMSYS Arena, supporting benchmarking across three modalities: text, vision, and code.
- **NVIDIA and AMD jointly announce Day-One support for Qwen 3.5**: Both companies offer free APIs (NVIDIA via NeMo, AMD via SGLang/vLLM) along with hardware acceleration on their respective GPU platforms, including AMD's Instinct accelerators.
- **Ollama Cloud and ZenMux roll out Qwen 3.5**: Ollama Cloud enables plug-and-play inference; ZenMux debuts **Qwen 3.5 Plus**, featuring a novel **Gated DeltaNet + Sparse MoE** architecture.
- **mlx-vlm v0.3.12 adds native macOS support**: For the first time, the Qwen 3.5 series now supports on-device visual-language model inference on Apple Silicon Macs.
- **LlamaCloud launches an enhanced PDF parsing engine**: Capable of high-fidelity extraction from complex PDFs—including tables and charts—with structured output in Markdown or JSON.
- **LlamaIndex adopts the $LLAMA token for unified API billing**: Moving away from monthly subscriptions, it introduces a universal token-based pricing model tailored for agent-driven API calls.
- **Google Antigravity releases a visual UI editor for building Agents**: Users screenshot a UI region and issue natural-language instructions, and the editor reconstructs the frontend interface in real time.
- **Fu Sheng publicly releases the OpenClaw open-source framework**: a demo generated personalized Lunar New Year greetings for **611 job-specific roles** in just four minutes, validating the "individual-as-a-team" productivity paradigm enabled by AI assistants.
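A minimal sketch of the plug-and-play inference mentioned in the Ollama Cloud item, using Ollama's standard `/api/chat` endpoint. The model tag `qwen3.5` and the local host URL are assumptions; the request shape follows Ollama's documented chat API:

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str, host: str = "http://localhost:11434"):
    """Build a non-streaming request for Ollama's /api/chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # return one complete response instead of chunks
    }
    req = urllib.request.Request(
        f"{host}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    return req, payload

# Usage (requires a running Ollama server with the model pulled):
# req, _ = build_chat_request("qwen3.5", "Summarize today's AI news in one line.")
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["message"]["content"])
```

Because the endpoint and payload shape are the same across Ollama-hosted models, switching models is a one-string change, which is the "plug-and-play" property the announcement highlights.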