LLM Skirmish: AI Agents Battle in Real-Time Strategy Games by Writing Code
THE GIST: LLM Skirmish is a benchmark where LLMs play RTS games against each other by writing code.
TOON Compression: Token-Efficient JSON for LLM Input
THE GIST: TOON compression reduces LLM input tokens by ~40% while maintaining 74% accuracy compared to JSON's 70%.
Speech-to-Speech AI Outperforms Traditional Models in New Evaluation
THE GIST: Ultravox's speech-native model outperforms both frontier speech and text models in the AIEWF eval, suggesting speech-to-speech is the future for AI voice agents.
NVSHMEM Accelerates Long-Context LLM Training in JAX/XLA
THE GIST: Integrating NVSHMEM into XLA optimizes context parallelism, enabling faster training of long-context LLMs like Llama 3 with up to 256K tokens.
Vesper AI Memory System Achieves 48x Improvement in Answer Quality
THE GIST: Vesper, a new AI memory system for Claude Code, significantly improves answer quality and query performance through learning, not just remembering.
MichiAI: Full-Duplex Speech LLM Achieves ~75ms Latency
THE GIST: MichiAI, a speech LLM designed for full-duplex interaction, achieves approximately 75ms latency using flow matching and continuous embeddings.
Step 3.5 Flash LLM Claims Highest Intelligence Density with 11B Active Parameters
THE GIST: Step 3.5 Flash, a sparse Mixture of Experts LLM, activates only 11B of its 196B parameters, achieving high reasoning capabilities with exceptional efficiency.
Anthropic's 'Project Panama' Scanned Millions of Books for AI Training
THE GIST: Anthropic's 'Project Panama' involved scanning millions of books to train its AI model, raising copyright and ethical concerns.