AI Compute Crunch Intensifies: Anthropic and Alibaba Face Supply Shortages
Sonic Intelligence
AI compute demand is outstripping supply, impacting major providers like Anthropic and Alibaba Cloud.
Explain Like I'm Five
"Imagine everyone wants to play with the coolest new toys (AI models), but there aren't enough toy factories (computer servers) to make them fast enough. So, some toy companies have to make their toys a bit simpler or take older toys away just so more kids can play, and sometimes the toys are slow. This means it's hard for everyone to get the best toys right now."
Deep Intelligence Analysis
This issue is not isolated to Anthropic. Reliability problems across providers on aggregation platforms like OpenRouter point to systemic compute contention industry-wide. Alibaba Cloud's CEO acknowledged as early as November that the company could not deploy new servers fast enough to meet customer demand, and the situation remains acute months later. The observed median output speed of 6 tokens/s for Alibaba Cloud's Qwen3.5 397B A17B model on OpenRouter further illustrates extreme inference resource contention. This data directly contradicts any "AI bubble" narrative suggesting an abundance of idle compute resources.
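To put that 6 tokens/s figure in perspective, a minimal sketch of the latency it implies (the response lengths and the 50 tokens/s comparison rate are illustrative assumptions, not figures from the source):

```python
# Rough wall-clock time implied by a given median decode speed.
# Response lengths and the "healthy" rate are illustrative assumptions.

def response_seconds(tokens: int, tokens_per_second: float) -> float:
    """Time to stream a response of `tokens` at a steady output rate."""
    return tokens / tokens_per_second

# At the observed 6 tokens/s median, a modest 1,200-token answer
# streams for 200 seconds -- versus 24 seconds at an assumed 50 tokens/s.
slow = response_seconds(1200, 6)   # 200.0
fast = response_seconds(1200, 50)  # 24.0
print(f"6 tok/s: {slow:.0f} s, 50 tok/s: {fast:.0f} s")
```

For agentic workloads that chain many such generations per task, this per-response latency compounds quickly.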
The surge in demand is significantly fueled by the "agentic inflection point," where models like Opus 4.6 and GPT 5.4 demonstrate advanced capabilities in complex tasks, particularly in software engineering and other professional services. These agentic workflows inherently consume far more tokens than traditional single-turn LLM usage, creating a compounding effect on compute requirements. Despite Anthropic's Claude Code generating a $2.5 billion annualized revenue run rate, the underlying infrastructure is visibly strained. The current environment suggests that while AI capabilities are rapidly advancing, the physical infrastructure required for widespread deployment is lagging, posing a critical challenge to the industry's near-term growth trajectory.
Impact Assessment
The escalating demand for AI compute, driven by advanced agentic models, is creating significant supply chain bottlenecks. This crunch could limit AI adoption and innovation, impacting the performance and availability of leading AI services.
Key Details
- Anthropic's uptime dropped to 'one 9' (roughly 90% availability) last week, attributed to unprecedented growth.
- Anthropic degraded products (e.g., reducing Opus 4.6 default effort, removing older Opus models from Claude Code) to free up compute.
- Alibaba Cloud CEO stated in November they couldn't keep pace with customer demand for new servers, a situation that persists.
- Alibaba Cloud's Qwen3.5 397B A17B model on OpenRouter shows a median output speed of 6 tokens/s, indicating inference resource contention.
- Anthropic's Claude Code has an annual run rate revenue of $2.5 billion.
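The 'one 9' figure above can be made concrete with a short availability calculation ("N nines" meaning 1 − 10⁻ᴺ uptime; the week-long window mirrors the report):

```python
# Downtime budget implied by "N nines" of availability.
# "One 9" = 90% uptime, "three 9s" = 99.9%, and so on.

def weekly_downtime_hours(nines: int) -> float:
    """Hours of permitted downtime per 7-day week at N nines of uptime."""
    availability = 1 - 10 ** (-nines)
    return (1 - availability) * 7 * 24

print(weekly_downtime_hours(1))  # one 9   -> 16.8 hours/week
print(weekly_downtime_hours(3))  # three 9s -> ~0.17 hours/week
```

At 'one 9', users could have faced on the order of 16–17 hours of degraded or unavailable service over the week, versus minutes at the three-nines level commonly expected of production APIs.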
Optimistic Outlook
The compute crunch signals robust demand and rapid AI advancement, potentially spurring massive investment in infrastructure and chip manufacturing. This could accelerate innovation in hardware efficiency and distributed computing solutions, ultimately strengthening the AI ecosystem.
Pessimistic Outlook
Persistent compute shortages could stifle AI growth, leading to degraded service quality, higher costs, and slower development cycles for new applications. This bottleneck might concentrate power among providers with existing compute resources, hindering broader market access and competition.