AI Compute Crunch Intensifies: Anthropic and Alibaba Face Supply Shortages
LLMs

Source: Martinalderson · Original author: Martin Alderson · 2 min read · Intelligence analysis by Gemini

Signal Summary

AI compute demand is outstripping supply, impacting major providers like Anthropic and Alibaba Cloud.

Explain Like I'm Five

"Imagine everyone wants to play with the coolest new toys (AI models), but there aren't enough toy factories (computer servers) to make them fast enough. So, some toy companies have to make their toys a bit simpler or take older toys away just so more kids can play, and sometimes the toys are slow. This means it's hard for everyone to get the best toys right now."

Original Reporting

Read the original article at Martinalderson for full context.

Deep Intelligence Analysis

The AI industry is confronting a significant compute crunch, a bottleneck that threatens to impede the rapid adoption and scaling of advanced AI models. Initial predictions of a "coming" shortage have proven understated; evidence now points to an active struggle among major providers to meet escalating demand. Anthropic, a prominent frontier AI lab, recently suffered severe uptime issues, at one point dropping to "one 9" (roughly 90%) reliability. Staff attributed the degradation to "unprecedented growth" that was "genuinely hard to forecast." In response, Anthropic has reportedly taken corrective actions, including reducing the default effort for Opus 4.6, temporarily removing access to older models (Opus 4, Opus 4.1, Sonnet 4.5) from Claude Code, and disabling prompt suggestions. Such measures, particularly when they touch a high-profile product like Claude Code, underscore the severity of the compute scarcity.
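For a sense of scale, "nines" of availability translate directly into allowed downtime. The sketch below uses standard availability arithmetic over a 30-day month; the figures are illustrative context, not measurements from the article.

```python
# Allowed downtime per 30-day month for a given availability level.
def monthly_downtime_hours(availability: float) -> float:
    hours_per_month = 30 * 24  # 720 hours in a 30-day month
    return hours_per_month * (1 - availability)

for label, availability in [("one 9", 0.90), ("two 9s", 0.99), ("three 9s", 0.999)]:
    print(f"{label}: up to {monthly_downtime_hours(availability):.1f} h downtime/month")
# one 9: up to 72.0 h downtime/month
# two 9s: up to 7.2 h downtime/month
# three 9s: up to 0.7 h downtime/month
```

Dropping from the industry-typical "three 9s" to "one 9" means roughly a hundredfold more permitted downtime, which is why the episode drew attention.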

This issue is not isolated to Anthropic. Reliability challenges on aggregator platforms such as OpenRouter point to systemic compute contention across the industry. Alibaba Cloud's CEO acknowledged as early as November an inability to keep pace with customer demand for new server deployments, a situation that remains dire months later. An observed median output speed of 6 tokens/s for Alibaba Cloud's Qwen3.5 397B A17B model on OpenRouter further illustrates extreme inference-side resource contention. This data directly contradicts any "AI bubble" narrative that presumes an abundance of idle compute.

The surge in demand is significantly fueled by the "agentic inflection point," where models like Opus 4.6 and GPT 5.4 demonstrate advanced capabilities in complex tasks, particularly in software engineering and other professional services. These agentic processes inherently consume a substantially higher volume of tokens compared to traditional LLM uses, creating a compounding effect on compute requirements. Despite Anthropic's Claude Code generating an impressive $2.5 billion in annual run rate revenue, the underlying infrastructure is visibly strained. The current compute environment suggests that while AI capabilities are rapidly advancing, the physical infrastructure required to support widespread deployment is lagging, posing a critical challenge for the industry's near-term growth trajectory.
AI-assisted intelligence report · EU AI Act Art. 50 compliant

Impact Assessment

The escalating demand for AI compute, driven by advanced agentic models, is creating significant supply chain bottlenecks. This crunch could limit AI adoption and innovation, impacting the performance and availability of leading AI services.

Key Details

  • Anthropic's uptime dropped to 'one 9' last week, attributed to unprecedented growth.
  • Anthropic degraded products (e.g., reducing Opus 4.6 default effort, removing older models from Claude Code) to free up compute.
  • Alibaba Cloud CEO stated in November they couldn't keep pace with customer demand for new servers, a situation that persists.
  • Alibaba Cloud's Qwen3.5 397B A17B model shows a median output speed of 6 tokens/s on OpenRouter, indicating inference resource contention.
  • Anthropic's Claude Code has an annual run rate revenue of $2.5 billion.
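To gauge what a 6 tokens/s median output rate means in practice, a quick back-of-the-envelope calculation; the 1,000-token reply length is an illustrative assumption, not a figure from the article.

```python
# Time to stream a full response at a given median output rate.
def response_time_seconds(tokens: int, tokens_per_second: float) -> float:
    return tokens / tokens_per_second

# At the reported 6 tokens/s median, an illustrative 1,000-token
# reply would take almost three minutes to stream.
seconds = response_time_seconds(1_000, 6)
print(f"{seconds:.0f} s (~{seconds / 60:.1f} min)")
# 167 s (~2.8 min)
```

Rates an order of magnitude higher are common on uncontended deployments, which is why a single-digit median reads as a symptom of saturated inference capacity rather than a model limitation.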

Optimistic Outlook

The compute crunch signals robust demand and rapid AI advancement, potentially spurring massive investment in infrastructure and chip manufacturing. This could accelerate innovation in hardware efficiency and distributed computing solutions, ultimately strengthening the AI ecosystem.

Pessimistic Outlook

Persistent compute shortages could stifle AI growth, leading to degraded service quality, higher costs, and slower development cycles for new applications. This bottleneck might concentrate power among providers with existing compute resources, hindering broader market access and competition.
