OpenAI’s new voice models transcribe as you speak

OpenAI has introduced three new realtime voice models designed for voice apps, translation and live transcription. The lineup includes GPT-Realtime-2 for advanced conversational reasoning, GPT-Realtime-Translate for live speech translation and GPT-Realtime-Whisper for low-latency speech-to-text transcription.

GPT-Realtime-2 is powered by GPT-5-class reasoning and can naturally handle interruptions, corrections and complex conversations. GPT-Realtime-Translate supports more than 70 input languages and 13 output languages, and GPT-Realtime-Whisper delivers real-time transcription for captions, meeting notes and other live applications.

All three models are now available through OpenAI’s Realtime API. Developers can test them in the Playground or integrate them into apps using Codex.

Source: 9to5Mac
