Step 3.5 Flash LLM Claims Highest Intelligence Density with 11B Active Parameters
LLMs

Source: Static · 1 min read · Intelligence Analysis by Gemini

Signal Summary

Step 3.5 Flash, a sparse Mixture of Experts LLM, activates only 11B of its 196B parameters, achieving high reasoning capabilities with exceptional efficiency.

Explain Like I'm Five

"Imagine a super smart robot that only uses a small part of its brain at a time to save energy! Step 3.5 Flash is like that, making it faster and cheaper to use."

Original Reporting

Read the original article at Static for full context.

Deep Intelligence Analysis

Step 3.5 Flash is an open-source foundation model engineered for frontier reasoning and agentic capabilities with exceptional efficiency. Built on a sparse Mixture of Experts (MoE) architecture, it selectively activates only 11B of its 196B parameters per token. This allows it to rival the reasoning depth of top-tier proprietary models, while maintaining the agility required for real-time interaction.
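The mechanism behind "activates only 11B of 196B parameters" can be sketched in a few lines: a gating network scores the experts for each token and only the top-k experts actually run. The sketch below is illustrative, not Step 3.5 Flash's actual routing code; the dimensions, expert count, and k=2 are assumptions chosen for readability.

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route one token through a sparse Mixture of Experts.

    Only the top-k experts (by gate score) execute; the remaining
    experts' parameters stay inactive for this token -- the property
    that lets an MoE model activate a small fraction of its weights.
    """
    logits = x @ gate_w                       # one gate score per expert
    top = np.argsort(logits)[-k:]             # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                  # softmax over the selected experts
    # Weighted sum of the chosen experts' outputs; all others are skipped.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, num_experts = 8, 16                        # toy sizes, not the real model's
x = rng.standard_normal(d)
gate_w = rng.standard_normal((d, num_experts))
# Each "expert" here is just a small linear map for illustration.
expert_mats = [rng.standard_normal((d, d)) for _ in range(num_experts)]
experts = [lambda v, W=W: v @ W for W in expert_mats]

y = moe_forward(x, gate_w, experts, k=2)      # only 2 of 16 experts executed
```

In a real MoE transformer this routing happens inside every MoE layer, per token, which is how total parameter count (196B) and active parameter count (11B) diverge.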

The model is purpose-built for agentic tasks, integrating a scalable RL framework that drives consistent self-improvement. It scores 74.4% on SWE-bench Verified and 51.0% on Terminal-Bench 2.0, demonstrating its ability to handle sophisticated, long-horizon tasks with stable performance. It also supports a cost-efficient 256K context window by interleaving Sliding Window Attention (SWA) and full-attention layers at a 3:1 ratio.
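Why a 3:1 SWA ratio makes a 256K context cost-efficient comes down to KV-cache size: sliding-window layers only cache the last few thousand tokens, while full-attention layers cache everything. The back-of-the-envelope estimate below assumes a 48-layer stack and a 4,096-token window purely for illustration; the article states neither figure.

```python
def kv_cache_tokens(num_layers, context_len, window, swa_ratio=3):
    """Estimate cached token entries per sequence for a stack that
    interleaves sliding-window and full-attention layers.

    With a 3:1 ratio, 3 of every 4 layers cache only the last
    `window` tokens; the remaining layer caches the full context.
    """
    group = swa_ratio + 1                     # e.g. 3 SWA + 1 full per group
    full_layers = num_layers // group
    swa_layers = num_layers - full_layers
    return full_layers * context_len + swa_layers * min(window, context_len)

dense = 48 * 256_000                          # every layer caches full context
mixed = kv_cache_tokens(48, 256_000, window=4096)
print(f"cache reduced to {mixed / dense:.1%} of dense")  # → 26.2% of dense
```

Under these assumed numbers the KV cache shrinks to roughly a quarter of an all-full-attention stack, which is the kind of saving that makes long contexts affordable at inference time.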

Step 3.5 Flash distinguishes itself through a "Think-and-Act" synergy in tool environments. Rather than merely executing commands, the model interleaves reasoning with large-scale tool orchestration across domains. It maintains strong intent alignment even when navigating vast, high-density toolsets, and adapts its reasoning to pivot between raw code execution and specialized API protocols.
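The "Think-and-Act" pattern described above can be pictured as a ReAct-style loop: the model emits a thought, optionally calls a tool, observes the result, and repeats until it produces an answer. This is a generic sketch of that loop, not Step 3.5 Flash's actual API; the `llm` callable and tool format are stand-ins.

```python
def think_and_act(task, llm, tools, max_steps=8):
    """Minimal think/act/observe loop over a named-tool registry.

    `llm` is a stub that maps the history to either
    {'thought', 'tool', 'args'} (act) or {'answer'} (finish).
    """
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        step = llm(history)
        if "answer" in step:                  # model decides it is done
            return step["answer"]
        history.append(f"Thought: {step['thought']}")
        result = tools[step["tool"]](**step["args"])  # dispatch the tool call
        history.append(f"Observation: {result}")
    return None                               # step budget exhausted

# Toy demo with a scripted "model" and one calculator tool.
tools = {"add": lambda a, b: a + b}
script = iter([
    {"thought": "I should add the numbers.", "tool": "add", "args": {"a": 2, "b": 3}},
    {"answer": "5"},
])
answer = think_and_act("what is 2 + 3?", lambda h: next(script), tools)  # → "5"
```

The claims about high-density toolsets and intent alignment amount to the model choosing the right entry from a large `tools` registry, with the right arguments, over many such iterations.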
AI-assisted intelligence report · EU AI Act Art. 50 compliant

Impact Assessment

Step 3.5 Flash demonstrates the potential of sparse MoE architectures to deliver high performance with reduced computational cost. This could enable more accessible and efficient AI applications.

Key Details

  • Step 3.5 Flash activates only 11B of its 196B parameters per token.
  • The model achieves 74.4% on SWE-bench Verified and 51.0% on Terminal-Bench 2.0.
  • Step 3.5 Flash supports a 256K context window using a 3:1 Sliding Window Attention ratio.

Optimistic Outlook

The model's efficient long context and tool-use capabilities could lead to more powerful and versatile AI agents. Further development could enable AI systems that can seamlessly interact with the real world and solve complex problems.

Pessimistic Outlook

The reliance on specific benchmarks and the potential for overfitting to those benchmarks could limit the model's real-world applicability. Scalability and the ability to generalize across diverse tasks remain key challenges.

