2 Sources
[1]
Report: OpenAI plans to launch new audio model in the first quarter - SiliconANGLE
OpenAI Group PBC is reportedly developing a new artificial intelligence model optimized for audio generation tasks. The Information today cited sources as saying that the algorithm will launch by the end of March. According to the publication, it's expected to produce more natural-sounding speech than OpenAI's current models. The AI will also be better at handling real-time back-and-forth interactions with users.

OpenAI will reportedly base the model on a new architecture. The company's current flagship real-time audio model, GPT-realtime, uses the ubiquitous transformer architecture. It's unclear whether the company will pivot to an entirely different algorithm design or simply adopt a new transformer implementation. Some transformer-based audio models process speech directly. Others, such as the Whisper algorithm that OpenAI released in 2022, turn audio files into graphs called spectrograms before processing them. Whisper and the company's newer audio models are all available in multiple editions with varying output quality. It's possible OpenAI will also offer multiple versions of the algorithm it's expected to release this quarter.

The company has reportedly combined several engineering, product and research teams to support its audio model push. The initiative is said to be led by Kundan Kumar, a former researcher at venture-backed AI provider Character.AI Inc. Many of the startup's other staffers joined Google LLC in late 2024 as part of a $2.7 billion reverse acquihire.

It's possible OpenAI's upcoming model will not focus solely on speech generation use cases. The nascent AI-generated music segment is currently experiencing rapid growth: the Wall Street Journal recently reported that one market player, startup Suno Inc., is generating more than $200 million in annual revenue. Joining the fray may help OpenAI boost its consumer business.

The upcoming audio model is part of a broader effort on the company's part to enter the consumer electronics market. According to The Information, OpenAI plans to launch an "audio-first personal device" in about a year. It's believed the company could eventually introduce an entire portfolio of devices complete with a smart speaker and smart glasses. Last May, OpenAI acquired product design startup io Products Inc. to support its consumer hardware push. The transaction valued the Jony Ive-founded startup at $6.5 billion. In October, the Financial Times reported that Ive is working on a smartphone-sized device that is designed to sit on a desk or table.

OpenAI may seek to develop a lightweight, on-device audio model to support its move into consumer hardware. Processing prompts locally is more cost-efficient than sending them to the cloud. Google has taken a similar approach with its Pixel smartphone series, which uses an on-device model called Gemini Nano to power some AI features.
[2]
OpenAI ramps up audio AI efforts ahead of new device launch: Report
OpenAI is reportedly accelerating its work on audio AI as it prepares to launch a new personal device expected in about a year. According to The Information, the company has merged several engineering, product, and research teams over the past two months to overhaul its audio models, signalling a major shift toward voice-first technology.

At the centre of this push is a new audio model planned for early 2026. The model is expected to sound more natural, deal with interruptions smoothly, and even speak while a user is talking, which today's AI systems still struggle with. OpenAI is also said to be exploring a lineup of devices, possibly including screenless speakers or smart glasses, designed to act more like companions than traditional gadgets.

This move fits into a much broader trend across the tech industry. Many companies now believe audio will become the main way people interact with technology, with screens fading into the background. Voice assistants are already common, with smart speakers used in more than a third of US homes, reports TechCrunch. Former Apple design chief Jony Ive, now involved in OpenAI's hardware efforts through the company's acquisition of his firm io, sees audio-first design as a way to reduce screen addiction and "right the wrongs" of past consumer devices.

In related news, a tipster recently shared that OpenAI and Ive are working on a device that could be a pen or "a 'to-go' audio device." The tipster also claimed that the gadget will likely be manufactured by Foxconn, the same company that builds iPhones.
OpenAI is developing a new audio model expected to launch by the end of March 2026, featuring more natural-sounding speech and improved real-time interactions. The company has merged multiple teams to support this effort, led by former Character.AI researcher Kundan Kumar. This push comes ahead of an audio-first personal device planned for launch in about a year, signaling OpenAI's major shift toward voice-first technology and the consumer electronics market.
OpenAI is developing a new audio model optimized for audio generation tasks, with sources indicating the algorithm will launch by the end of March 2026, according to The Information [1]. The new model is expected to produce more natural-sounding speech than the company's current models and to handle real-time interactions more effectively, including the ability to speak while a user is talking, a capability today's AI systems still struggle with [2].
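The "speak while a user is talking" capability is essentially barge-in handling: the system monitors the microphone while the assistant is speaking and yields the floor as soon as user speech is detected. The Python sketch below is only a minimal illustration of that general pattern, using assumed stand-ins (a toy energy-based voice-activity check and a simulated microphone queue); it is not a description of how OpenAI's models implement it.

```python
import queue
import threading
import time

# Illustrative barge-in loop: pause the assistant's audio playback as soon
# as a (toy) voice-activity check on the microphone stream detects the user
# speaking. The microphone queue, the check, and the playback flag are stand-ins.

mic_frames = queue.Queue()        # holds short PCM byte frames from the mic (simulated)
assistant_speaking = threading.Event()
assistant_speaking.set()          # assume the assistant is currently mid-sentence

def looks_like_speech(frame: bytes, threshold: float = 100.0) -> bool:
    """Toy energy-based check; real systems use a trained voice-activity detector."""
    return bool(frame) and (sum(frame) / len(frame)) > threshold

def barge_in_monitor() -> None:
    """Stop assistant playback the moment user speech shows up on the mic."""
    while True:
        frame = mic_frames.get()
        if frame is None:                      # sentinel: shut down the monitor
            return
        if assistant_speaking.is_set() and looks_like_speech(frame):
            assistant_speaking.clear()         # a real player would pause or duck here
            print("Barge-in detected: pausing assistant audio")

threading.Thread(target=barge_in_monitor, daemon=True).start()

# Simulated microphone input: silence, then loud user speech that triggers barge-in.
mic_frames.put(bytes([0] * 320))    # silence
mic_frames.put(bytes([200] * 320))  # user starts talking over the assistant
time.sleep(0.2)                     # give the monitor thread time to react
mic_frames.put(None)
```

A production system would use a trained voice-activity or end-of-turn detector and low-latency audio I/O; the new model described above is expected to handle this kind of overlapping speech natively rather than relying on an external gate like this.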
The upcoming model will reportedly be based on a new architecture, marking a departure from OpenAI's current flagship real-time audio model, GPT-realtime, which uses the transformer architecture [1]. While it remains unclear whether the company will pivot to an entirely different algorithm design or simply adopt a new transformer implementation, the move points to a significant technical shift. Some transformer-based audio systems process speech directly, while others, such as Whisper, which OpenAI released in 2022, first convert audio files into spectrograms before processing them [1]. The company may offer multiple versions of the new algorithm with varying output quality, similar to its existing audio models.
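To make the spectrogram route concrete, the snippet below converts a waveform into a log-mel spectrogram using the open-source librosa library, roughly the kind of two-dimensional representation that Whisper-style models consume instead of raw samples. The file name and parameter values are illustrative defaults, not details of OpenAI's pipeline.

```python
import librosa
import numpy as np

# Convert a waveform into a log-mel spectrogram, the 2-D "image" of audio
# that spectrogram-based models such as Whisper consume instead of raw samples.
# "speech.wav" and the parameter choices below are illustrative.

waveform, sample_rate = librosa.load("speech.wav", sr=16000)  # resample to 16 kHz

mel = librosa.feature.melspectrogram(
    y=waveform,
    sr=sample_rate,
    n_fft=400,        # ~25 ms analysis window at 16 kHz
    hop_length=160,   # 10 ms step between frames
    n_mels=80,        # 80 mel frequency bands
)
log_mel = librosa.power_to_db(mel, ref=np.max)  # compress dynamic range

print(log_mel.shape)  # (80, num_frames): the model sees this, not the raw audio
```

Models that "process speech directly" skip this conversion and operate on raw or lightly encoded audio samples instead.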
OpenAI has merged several engineering, product, and research teams over the past two months to overhaul its audio models, signaling a major shift toward voice-first technology [2]. The initiative is led by Kundan Kumar, a former researcher at venture-backed AI provider Character.AI Inc.; many of that startup's other staffers joined Google LLC in late 2024 as part of a $2.7 billion reverse acquihire [1]. This consolidation reflects the company's commitment to building companion-like devices that prioritize audio over traditional screen-based interactions.
The audio model development is part of OpenAI's broader effort to enter the consumer electronics market. The company plans to launch an audio-first personal device in about a year, with speculation that it could eventually introduce an entire portfolio including smart speakers and smart glasses [1]. Last May, OpenAI acquired product design startup io Products Inc. to support its consumer hardware push, a deal that valued the Jony Ive-founded startup at $6.5 billion [1]. The Financial Times reported in October that Ive is working on a smartphone-sized device designed to sit on a desk or table [1].
This move fits a broader industry trend in which companies increasingly believe audio will become the primary way people interact with technology. Voice assistants are already common, with smart speakers used in more than a third of US homes [2]. Former Apple design chief Jony Ive sees audio-first design as a way to reduce screen addiction and "right the wrongs" of past consumer devices [2]. A recent tip suggests the device could be a pen or a "to-go" audio device, likely manufactured by Foxconn, the same company that builds iPhones [2].
OpenAI's upcoming model may not focus solely on speech generation use cases. The nascent AI-generated music segment is experiencing rapid growth, with the Wall Street Journal reporting that startup Suno Inc. is generating more than $200 million in annual revenue [1]. Entering this market could help OpenAI boost its consumer business. The company may also develop a lightweight, on-device audio model to support its hardware ambitions, since processing prompts locally is more cost-efficient than sending them to the cloud, an approach Google has taken with Gemini Nano on its Pixel smartphone series [1].
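As a rough illustration of the on-device pattern, the sketch below runs a small open-weight speech-recognition model entirely locally via Hugging Face's transformers pipeline. The model choice (openai/whisper-tiny) and the audio file are placeholders standing in for whatever lightweight model OpenAI might eventually ship; the point is simply that inference happens on the device, with no cloud round trip.

```python
from transformers import pipeline

# On-device pattern: load a small open-weight audio model once and run it
# locally, so audio prompts never leave the machine. "openai/whisper-tiny"
# (~39M parameters) is used purely as a stand-in for a lightweight on-device
# model; "clip.wav" is a placeholder file.

asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-tiny",   # small enough to run on CPU
    device=-1,                     # -1 = CPU; everything stays local
)

result = asr("clip.wav")
print(result["text"])
```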
Summarized by Navi