Americas

  • United States

Asia

ChatGPT users speechless over delays

news
Jun 26, 20244 mins
Artificial IntelligenceGenerative AIVoice Assistants

OpenAI has delayed an alpha release of its new voice mode for ChatGPT, citing safety and scalability concerns

ChatGPT
Credit: Ascannio / Shutterstock

OpenAI has delayed the release of ChatGPT’s much-anticipated new Voice Mode feature, saying it needs another month” to refine the technology before offering it to a limited group of users in an alpha test.

“We had planned to start rolling this out in alpha to a small group of ChatGPT Plus users in late June, but need one more month to reach our bar to launch,” the  company said in social media platform X.

It said it “needs one more month to reach our bar to launch.”

OpenAI was more optimistic back in May, when it showcased Voice Mode during the Spring Update event at which it launched the faster and more capable GPT-4o large language model.

“We plan to launch a new Voice Mode with these new capabilities in an alpha in the coming weeks, with early access for Plus users as we roll out more broadly,” it said then, referring to users of its $20/month ChatGPT Plus subscription service.

With the introduction of GPT-4o, OpenAI said it was able to cut the voice response to time around 320 milliseconds, from 5.4 seconds for GPT-4, creating a more natural and real-time conversational experience.

Safety and scalability concerns take center stage

OpenAI gave two reasons for the launch delay: safety and scalability.

It emphasized its commitment to responsible AI development and the need for the model to effectively “detect and refuse certain content.” This suggests concerns about potential misuse of the technology for generating harmful or offensive speech.

Scalability also appears to be a hurdle. OpenAI said it aims to ensure the feature functions smoothly for millions of users while maintaining real-time responsiveness. This requires robust infrastructure capable of handling the increased processing demands.

“Exact timelines depend on meeting our high safety and reliability bar,” the company added in the post. “We are also working on rolling out the new video and screen sharing capabilities we demoed separately, and will keep you posted on that timeline.”

More competition for ChatGPT

OpenAI’s delay in Voice Mode rollout creates an interesting scenario in the burgeoning field of AI voice capabilities.

Competitors like Anthropic, with its Claude 3.5 Sonnet model, have already showcased voice-enabled interaction during demos.

Similarly, Google’s AI research arm, DeepMind, has been making strides in voice-based AI with its LaMDA language model,.

“Anthropic has joined this year’s intense AI race with models designed to compete head-on with recent announcements from OpenAI and Google,” said Neil Shah, VP for research and partner at Counterpoint Research. “Generative AI is a blue ocean opportunity, and each company, including Anthropic and OpenAI, will need to target specific use cases and segments. Anthropic, for example, is focusing on coding, writing, and workflow optimization.”

Beyond dedicated AI models, large language models such as Bard (Google AI) and Jurassic-1 Jumbo (AI21 Labs) are also constantly evolving, with some incorporating basic functionalities for voice interaction and response generation.

Even Microsoft’s Copilot programming assistant has begun to integrate voice-based guidance for developers.

OpenAI’s iterative approach: safety first

OpenAI’s decision to prioritize safety and scalability reflects a cautious yet responsible approach. Launching a powerful voice-enabled AI requires careful consideration of potential risks and ensuring the technology can handle widespread use without compromising performance.

“As part of our iterative deployment strategy, we’ll start the alpha with a small group of users to gather feedback and expand based on what we learn,” said the company.  

This iterative approach allows them to refine the model based on real-world user interactions and mitigate potential issues before a wider release.

While the delay may disappoint some users eager to experience Voice Mode, it does show a certain caution in the face of recent criticism of OpenAI’s attitude to safety. It has been working to restore confidence in that area with a series of appointments to its new safety and security committee.

Gyana is a contributing writer.