OpenAI is rolling out new voice features for its ChatGPT chatbot to a select group of ChatGPT Plus subscribers in an early alpha trial, the company announced on X on Tuesday. This advanced voice mode was initially previewed during OpenAI's Spring Update in May, where it also introduced the GPT-4o model.
Subscribers with early access have taken to social media to share their experiences, showcasing the chatbot's capabilities in assisting with French pronunciations, mimicking an airline pilot, and imitating seven U.S. regional dialects. While the New York and Midwestern accents still need refinement, the chatbot's knowledge of New Yorkers folding their pizza is on point.
OpenAI's move to enhance its voice functionality comes as other tech giants also vie for dominance in the generative AI market, which is projected to reach $1.3 trillion by 2032. Google plans to release its conversational Gemini chatbot for Gemini Advanced subscribers, and Meta's Meta AI chatbot already interacts with users through Ray-Ban glasses.
The advanced voice mode in ChatGPT aims to enable more natural, real-time conversations, responding to user emotions and allowing interruptions. Users can activate the feature with the command, "Hey, ChatGPT."
Details on the full capabilities of this mode remain limited, as OpenAI has not provided a comprehensive outline. However, subscribers in the alpha test will receive instructions via the ChatGPT app and email. This trial phase is intended to monitor usage and enhance the model's capabilities and safety before a broader rollout.
OpenAI plans to extend access to more subscribers over the coming weeks, with a goal to make advanced voice functionality available to all Plus members by the fall. Plus subscribers benefit from early access to new features, an always-on connection, and unlimited access to GPT-4o. In contrast, free version users may be downgraded to the smaller GPT-4o mini model during high traffic.
Voice functionality was first introduced in ChatGPT in September 2023. The new advanced voice mode includes four preset voices (Breeze, Cove, Ember, and Juniper) developed with voice actors in 2023. A fifth voice, Sky, was paused following a complaint from actress Scarlett Johansson, who said it resembled her voice in the movie "Her." CEO Sam Altman issued an apology, clarifying that the resemblance was unintentional.
In a blog post, OpenAI emphasized that it selected voice actors from diverse backgrounds, aiming for voices that are timeless, approachable, trustworthy, warm, engaging, and easy to listen to. OpenAI said that ChatGPT cannot impersonate other people's voices and that it has implemented filters to block requests for generating copyrighted audio.