OpenAI and Broadcom Unveil Jalapeño, a Custom LLM Inference Chip
One-liner: OpenAI and Broadcom introduced Jalapeño, OpenAI's first custom AI accelerator — a reticle-sized ASIC designed from the ground up for LLM inference, not repurposed from training hardware.
Key Facts
- Full-reticle ASIC purpose-built for LLM inference; designed in nine months using OpenAI's own AI models to accelerate chip development
- Developed with Broadcom (manufacturer) and Celestica (production scale-up)
- Early testing shows better performance per watt than current state-of-the-art chips
- Microsoft will purchase 40% of initial production; full deployment targeted for late 2026
- OpenAI aims for 10 gigawatts of Jalapeño-powered compute by 2029, roughly the output of ten large nuclear reactors at about 1 gigawatt each
Why It Matters
This is OpenAI's clearest move yet toward hardware self-sufficiency at the inference layer. Training still depends on Nvidia GPUs, but inference is where serving costs compound at scale, because every ChatGPT query is an inference request. A purpose-built inference ASIC could meaningfully reshape OpenAI's unit economics as it approaches its IPO, while signaling that Nvidia's dominance at the inference end of the AI stack is no longer guaranteed.