OpenAI starts rolling out Gemini Live-like Advanced Voice Mode for ChatGPT

ChatGPT's Advanced Voice Mode utilises the GPT-4o AI model's multimodal capability to offer more natural-sounding real-time conversations with the AI chatbot

Harsh Shivam | New Delhi
2 min read | Last Updated: Sep 25 2024 | 4:07 PM IST
Microsoft-backed artificial intelligence startup OpenAI has announced the rollout of the Advanced Voice Mode feature for all ChatGPT Plus and Team subscribers. This new feature will allow users to engage in more natural conversations with the ChatGPT AI chatbot. It will be available within the ChatGPT app for paid members by the end of this week. Enterprise and education customers will gain access at a later date.

OpenAI has also noted that Advanced Voice Mode is currently unavailable in the European Union, as well as in the UK, Switzerland, Iceland, Norway, and Liechtenstein.

OpenAI has also added five new voices to ChatGPT, named Arbor, Maple, Sol, Spruce, and Vale, bringing the total to nine voice options.

ChatGPT Advanced Voice Mode: What is it?

OpenAI introduced Advanced Voice Mode during the launch of its GPT-4o model in May this year. The company said the mode facilitates more natural, real-time conversations with the AI chatbot, that it will “allow you to interrupt at any time”, and that it “senses and responds to your emotions.”
The existing Voice Mode operates with latencies averaging 2.8 seconds on GPT-3.5 and 5.4 seconds on GPT-4. This latency results from a data processing pipeline involving three separate models: one transcribes the user's audio to text, GPT-3.5 or GPT-4 processes the text, and a third converts the text reply back to audio. OpenAI noted that this multi-model hand-off causes the main GPT-4 model to lose a significant amount of information, as it never processes the audio directly.
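That older pipeline can be approximated with OpenAI's public APIs. The sketch below is illustrative only, assuming the openai Python SDK (v1.x) and an API key in the environment; the model and file names are placeholders, not the components ChatGPT uses internally.

```python
# Illustrative sketch of the older three-model voice pipeline:
# speech -> text (transcription), text -> text (chat model), text -> speech (TTS).
# Model and file names here are placeholders, not ChatGPT's internal components.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# 1. Transcribe the user's spoken question to text
with open("user_question.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

# 2. Generate a text reply with a chat model
reply = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": transcript.text}],
)
reply_text = reply.choices[0].message.content

# 3. Convert the text reply back to speech
speech = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input=reply_text,
)
speech.write_to_file("assistant_reply.mp3")
```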

The GPT-4o model addresses this issue by processing all inputs and outputs—text, vision, and audio—through the same neural network. This integration reduces latency, enhances the naturalness of conversations, and improves overall performance. Additionally, GPT-4o is better equipped to handle interruptions, manage group conversations, filter out background noise, and adapt to tone.
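By contrast, an audio-native model can return spoken output from a single call, with no separate transcription or text-to-speech step. The sketch below is a hypothetical illustration using OpenAI's audio-capable chat completions endpoint; the gpt-4o-audio-preview model name and response fields are assumptions for illustration and do not represent how Advanced Voice Mode is wired up inside ChatGPT.

```python
# Illustrative sketch: a single audio-native model returns spoken output directly,
# with no separate transcription or TTS step. The model name and response fields
# are assumptions for illustration, not OpenAI's Advanced Voice internals.
import base64

from openai import OpenAI

client = OpenAI()

completion = client.chat.completions.create(
    model="gpt-4o-audio-preview",        # assumed audio-capable GPT-4o variant
    modalities=["text", "audio"],        # request both a transcript and audio
    audio={"voice": "alloy", "format": "wav"},
    messages=[{"role": "user", "content": "Briefly explain what Advanced Voice Mode is."}],
)

# The spoken reply arrives as base64-encoded audio alongside its text transcript
audio_bytes = base64.b64decode(completion.choices[0].message.audio.data)
with open("assistant_reply.wav", "wb") as f:
    f.write(audio_bytes)
```
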
While announcing the rollout schedule for ChatGPT Advanced Voice Mode, OpenAI said that users will be able to set custom instructions for Advanced Voice. The company has also improved conversational speed, smoothness, and accents in select foreign languages.

Topics: OpenAI, ChatGPT, Artificial Intelligence

First Published: Sep 25 2024 | 4:07 PM IST
