December 2025
A candid reflection on 2025’s AI hype and developer FOMO, urging programmers to focus on fundamentals and shipping real features instead of chasing every new tool.
A candid analysis arguing that OpenAI often launches breakthroughs but is quickly outpaced, and that its real advantage (and strategy) is owning the default chat experience rather than leading every capability.
Theo argues that OpenAI keeps launching breakthrough features only to be quickly surpassed by rivals like Google and Anthropic across coding, images, docs, tools, and browsing, while still winning overall by owning the default chat experience that keeps users from switching.
Overview of Claude Code’s latest update, covering sub-agents for asynchronous tasks, UltraThink deep reasoning mode, new LSP-powered code intelligence, Chrome/browser testing, Android and Slack integrations, status line and hotkey improvements, and expanded open-standard Skills.
Overview and live demo of Claude Code CLI subagents to parallelize research and implementation while keeping the main thread’s context window small for better, faster code changes.
A fast-paced breakdown of OpenRouter’s 2025 State of AI report, highlighting China’s surge in open-weight models, the dominance of roleplay and coding use cases, and the rapid shift to reasoning and agentic inference.
A hands-on review of Z.ai’s GLM-4.7 showing strong gains in agentic coding and creative generation across demos that include browser OS voice controls, 3D printing and flight simulators, a drum kit, games, and quirky website builds, with a few retries needed for complex tasks.
Live test of Anthropic’s new Claude Chrome extension shows it controlling the browser, integrating with Claude Code to build a local web app, and setting up Firebase with Google Auth while verifying data end-to-end.
A developer explains how burnout and skepticism turned into fully embracing AI tools for coding, showing practical agentic workflows that boost speed and satisfaction while noting limits and risks.
The hosts recap 2025’s biggest AI surprises and debate GPT‑5.2, Opus 4.5, and Gemini 3 Flash, then make enterprise and consumer predictions for 2026, discuss coding agents, costs and tooling, and whether we’re in an AI bubble.
The creator benchmarks a four‑Mac Studio cluster using Apple’s new RDMA over Thunderbolt and Exo 1.0 to shard and pipeline giant LLMs, showing major latency cuts and token‑rate gains versus Ethernet while comparing dense and MoE scaling, power draw, and current software limits.
A concise walkthrough of TanStack AI showing how to build a typesafe chat app with streaming responses and both server and client tools in just a few lines of React and server code.
Theo argues that despite strong benchmark scores, GPT-5.2 feels worse in real use, with regressions in reasoning, instruction following, and speed compared to alternatives, and uses his own practical evaluations to show why benchmarks are becoming less meaningful.
A practical walkthrough of using Beads to give coding agents durable, shareable long‑term memory via a Git-backed issue graph and CLI/UI workflows.
Theo tests Anthropic’s new Claude Code features, such as async sub-agents and background tasks, finding impressive parallelism and potential but many frustrating bugs and UX rough edges.
Nik Pash argues that effective AI coding agents depend less on elaborate scaffolding and more on strong models trained via rigorous benchmarks and RL environments, sharing Cline’s lessons on building verifiers and automating real-world evals.
Hands-on first look at Z.ai’s GLM‑4.6V multimodal model, testing its native tool use and impressive image‑to‑code abilities across UI replication, sketch‑to‑site, interactive edits, and a browser OS demo.
A fast-paced, practical tour of 50 AI use cases—from chat, memory, and web research to image/video generation, audio/music tools, slide decks, and shipping a paywalled mobile app—showing how to combine models and APIs to supercharge everyday work and content creation.
Theo explains Anthropic’s acquisition of Bun, why it aligns with Claude Code’s strategy, and why Bun will remain open-source while accelerating JS tooling for AI-driven development.
Roundup and benchmarks of December 2025 AI coding agents comparing Opus 4.5, Gemini 3.0 Pro, and GPT‑5.1 across popular harnesses, highlighting Opus’s consistency, Gemini’s harness sensitivity, and GPT‑5.1’s variability.