OpenAI details how it rebuilt voice infrastructure for low-latency AI at scale
OpenAI published a technical breakdown of the infrastructure behind low-latency voice interactions in ChatGPT and the Realtime API.
OpenAI says the system must now support more than 900 million weekly active users while keeping connection setup fast and keeping media round-trip time, jitter, and packet loss low enough for natural turn-taking. The company says it rearchitected its WebRTC stack around a split relay-plus-transceiver design, which preserves standard client-side WebRTC behavior while changing how packets are routed inside OpenAI’s infrastructure.
The post says the redesign was driven by three scaling constraints: one-port-per-session media termination did not fit OpenAI’s infrastructure well, stateful ICE and DTLS sessions needed stable ownership, and global routing had to keep first-hop latency low.
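To make the first two constraints concrete, here is a minimal Python sketch of one way a relay on a shared ingress could hand each session to a single transceiver and keep it there. It is an illustration under stated assumptions, not OpenAI's implementation: the Relay class, the session_key string, the transceiver names, and the round-robin assignment are all hypothetical. The summary above does not say how packets are keyed or how owners are chosen.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Relay:
    """Hypothetical relay: one shared ingress fronting many sessions."""

    transceivers: List[str]                      # assumed backend identifiers
    owners: Dict[str, str] = field(default_factory=dict)
    _next: int = 0                               # round-robin cursor (assumption)

    def route(self, session_key: str) -> str:
        """Return the transceiver that owns session_key.

        The first packet of a session picks an owner; every later packet
        for that session goes to the same owner, so the stateful ICE and
        DTLS context stays in one place.
        """
        owner = self.owners.get(session_key)
        if owner is None:
            owner = self.transceivers[self._next % len(self.transceivers)]
            self._next += 1
            self.owners[session_key] = owner
        return owner


relay = Relay(["transceiver-a", "transceiver-b"])
first = relay.route("session-1")
second = relay.route("session-2")
again = relay.route("session-1")   # same owner as `first`
```

Because sessions are told apart by a key rather than by a dedicated port, the number of sessions in this sketch is not tied to the number of open ports. A production system would also need owner failover and session expiry, which are omitted here.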
Source: https://openai.com/index/delivering-low-latency-voice-ai-at-scale