
Advancing Voice Intelligence With New Models In The API

OpenAI, Thursday, May 7th, 2026

OpenAI releases three advanced audio models for developers to build natural, intelligent voice applications.

OpenAI introduced three new audio models (GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper) that let developers build voice applications with real-time reasoning, translation, and transcription. GPT-Realtime-2 offers stronger reasoning, a 128K context window, more reliable tool calling, and adjustable reasoning levels for complex interactions.

GPT-Realtime-Translate provides live translation from more than 70 input languages into 13 output languages, while GPT-Realtime-Whisper offers streaming speech-to-text transcription.

These models support emerging voice AI patterns including voice-to-action, systems-to-voice, and voice-to-voice interactions, enabling practical applications from home search assistance to multilingual customer support.
