AI Compute Crunch Intensifies: Anthropic and Alibaba Face Supply Shortages
Sonic Intelligence
AI compute demand is outstripping supply, impacting major providers like Anthropic and Alibaba Cloud.
Explain Like I'm Five
"Imagine everyone wants to play with the coolest new toys (AI models), but there aren't enough toy factories (computer servers) to make them fast enough. So, some toy companies have to make their toys a bit simpler or take older toys away just so more kids can play, and sometimes the toys are slow. This means it's hard for everyone to get the best toys right now."
Deep Intelligence Analysis
This issue is not isolated to Anthropic. Reliability problems across providers on aggregation platforms like OpenRouter point to systemic compute contention industry-wide. Alibaba Cloud's CEO acknowledged as early as November that the company could not deploy new servers fast enough to meet customer demand, and the situation remains acute months later. The observed median output speed of 6 tokens/s for Alibaba Cloud's Qwen3.5 397B A17B model on OpenRouter further illustrates extreme inference resource contention. This data directly contradicts any "AI bubble" narrative suggesting an abundance of idle compute resources.
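To put that 6 tokens/s figure in perspective, a minimal sketch of the latency it implies (the response lengths and the 50 tokens/s comparison rate are illustrative assumptions, not figures from the source):

```python
# Rough wall-clock time implied by a given median decode speed.
# Response lengths and the "healthy" rate are illustrative assumptions.

def response_seconds(tokens: int, tokens_per_second: float) -> float:
    """Time to stream a response of `tokens` at a steady output rate."""
    return tokens / tokens_per_second

# At the observed 6 tokens/s median, a modest 1,200-token answer
# streams for 200 seconds -- versus 24 seconds at an assumed 50 tokens/s.
slow = response_seconds(1200, 6)   # 200.0
fast = response_seconds(1200, 50)  # 24.0
print(f"6 tok/s: {slow:.0f} s, 50 tok/s: {fast:.0f} s")
```

For agentic workloads that chain many such generations per task, this per-response latency compounds quickly.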
The surge in demand is significantly fueled by the "agentic inflection point," where models like Opus 4.6 and GPT 5.4 demonstrate advanced capabilities in complex tasks, particularly in software engineering and other professional services. These agentic workflows inherently consume far more tokens than traditional single-turn LLM usage, creating a compounding effect on compute requirements. Despite Anthropic's Claude Code generating a $2.5 billion annualized revenue run rate, the underlying infrastructure is visibly strained. The current environment suggests that while AI capabilities are rapidly advancing, the physical infrastructure required for widespread deployment is lagging, posing a critical challenge to the industry's near-term growth trajectory.
Impact Assessment
The escalating demand for AI compute, driven by advanced agentic models, is creating significant supply chain bottlenecks. This crunch could limit AI adoption and innovation, impacting the performance and availability of leading AI services.
Key Details
- Anthropic's uptime dropped to 'one 9' (roughly 90% availability) last week, attributed to unprecedented growth.
- Anthropic degraded products (e.g., reducing Opus 4.6 default effort, removing older Opus models from Claude Code) to free up compute.
- Alibaba Cloud CEO stated in November they couldn't keep pace with customer demand for new servers, a situation that persists.
- Alibaba Cloud's Qwen3.5 397B A17B model on OpenRouter shows a median output speed of 6 tokens/s, indicating inference resource contention.
- Anthropic's Claude Code has an annual run rate revenue of $2.5 billion.
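The 'one 9' figure above can be made concrete with a short availability calculation ("N nines" meaning 1 − 10⁻ᴺ uptime; the week-long window mirrors the report):

```python
# Downtime budget implied by "N nines" of availability.
# "One 9" = 90% uptime, "three 9s" = 99.9%, and so on.

def weekly_downtime_hours(nines: int) -> float:
    """Hours of permitted downtime per 7-day week at N nines of uptime."""
    availability = 1 - 10 ** (-nines)
    return (1 - availability) * 7 * 24

print(weekly_downtime_hours(1))  # one 9   -> 16.8 hours/week
print(weekly_downtime_hours(3))  # three 9s -> ~0.17 hours/week
```

At 'one 9', users could have faced on the order of 16–17 hours of degraded or unavailable service over the week, versus minutes at the three-nines level commonly expected of production APIs.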
Optimistic Outlook
The compute crunch signals robust demand and rapid AI advancement, potentially spurring massive investment in infrastructure and chip manufacturing. This could accelerate innovation in hardware efficiency and distributed computing solutions, ultimately strengthening the AI ecosystem.
Pessimistic Outlook
Persistent compute shortages could stifle AI growth, leading to degraded service quality, higher costs, and slower development cycles for new applications. This bottleneck might concentrate power among providers with existing compute resources, hindering broader market access and competition.