Business Standard

OpenAI rolls out advanced voice mode in ChatGPT for Plus users: What is it

Explaining advanced voice mode, OpenAI said that ChatGPT will offer more natural, real-time conversations, allow you to interrupt anytime, and sense and respond to your emotions

ChatGPT's advanced voice mode


Prakruti Mishra New Delhi


Advanced voice mode is starting to roll out to a small group of ChatGPT Plus users, Microsoft-backed artificial intelligence startup OpenAI announced in a post on X. First announced in May, the feature was originally slated for a June release but was delayed to give it time to reach the company's launch standard. Here is all you need to know about OpenAI’s advanced voice mode for ChatGPT:

ChatGPT advanced voice mode: What is it

In September 2023, OpenAI announced support for voice and image capabilities in ChatGPT. The announcement was followed by a new multimodal language model in May this year, dubbed GPT-4o, which it said would enable advanced voice mode in ChatGPT.
Explaining the advanced capabilities of voice mode, OpenAI said that ChatGPT will “offer more natural, real-time conversations, allows you to interrupt anytime, and senses and responds to your emotions.”

In the current version, the voice mode used to talk to ChatGPT has average latencies of 2.8 seconds (GPT-3.5) and 5.4 seconds (GPT-4). This latency is the result of a data-processing pipeline of three separate models: a simple model transcribes audio to text, GPT-3.5 or GPT-4 takes in text and outputs text, and a third simple model converts that text back to audio. According to OpenAI, this process results in a significant loss of information before the input reaches the main source of intelligence, GPT-4.
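The pipeline described above can be sketched in a few lines. This is purely an illustration, not OpenAI's actual code: the function names and stub outputs are hypothetical. The point it demonstrates is that each stage hands off plain text, so any non-textual signal in the audio is discarded before the language model sees it.

```python
# Illustrative sketch of the three-model voice pipeline: speech-to-text,
# a text-only language model, then text-to-speech. All names are
# hypothetical stubs standing in for the real models.

def transcribe(audio: bytes) -> str:
    """Stage 1: a simple speech-to-text model (stub)."""
    return "hello there"  # placeholder transcription

def llm_reply(text: str) -> str:
    """Stage 2: GPT-3.5/GPT-4 receives only text, never the raw audio."""
    return f"You said: {text}"

def synthesize(text: str) -> bytes:
    """Stage 3: a simple text-to-speech model (stub)."""
    return text.encode("utf-8")  # placeholder audio

def voice_mode(audio: bytes) -> bytes:
    text_in = transcribe(audio)    # tone, emotion, background sound lost here
    text_out = llm_reply(text_in)  # the model reasons over text alone
    return synthesize(text_out)    # expressiveness limited to what text carries
```

Because only strings cross the stage boundaries, cues such as the speaker's tone or background noise never reach the model, which is the information loss OpenAI describes.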

With the GPT-4o model, which the company said is trained end-to-end across text, vision, and audio, all inputs and outputs are processed by the same neural network. This lowers latency for a more natural conversational experience and improves results, since no information is lost in hand-offs between separate models. Additionally, OpenAI said that GPT-4o is better at handling interruptions, manages group conversations effectively, filters out background noise, and adapts to tone.

Essentially, the advanced voice mode enables conversational artificial intelligence in ChatGPT.

ChatGPT advanced voice mode: Availability

The advanced voice mode capability is currently being tested with a small batch of ChatGPT Plus users. OpenAI said the users selected in this alpha will receive an email with instructions and a message in their mobile app. OpenAI plans to add more people on a rolling basis and expects everyone on Plus to have access in the fall.

OpenAI said that what it learns from this alpha will help it make the advanced voice experience safer and more enjoyable for everyone. The startup plans to share a detailed report on GPT-4o’s capabilities, limitations, and safety evaluations in early August.


First Published: Jul 31 2024 | 12:28 PM IST
