Curated by THEOUTPOST
On Wed, 2 Oct, 4:03 PM UTC
5 Sources
[1]
OpenAI lets anyone use its new voice assistant in third-party apps
OpenAI is letting businesses and developers include its real-time voice assistant in their own applications, paving the way for more users to have realistic-sounding spoken conversations with an artificial intelligence system for a wide range of tasks. The feature, which is similar to the newest voice assistant available to paying users of its ChatGPT chatbot, became available on Tuesday to those building apps and services with OpenAI's application programming interface, or API.

Businesses will be able to develop software such as a voice-based customer service bot or a travel app that places phone calls on a user's behalf, OpenAI said. Some companies have already been testing out the new voice capability, including health coaching app Healthify and language learning app Speak.

The third-party voice integration is one of a number of updates OpenAI announced at a developer event in San Francisco on Tuesday. The conference offered OpenAI a chance to show how it plans to stay ahead of an increasingly crowded market for AI software at a time when it's looking to close a large funding round. It also came days after several leaders, including Chief Technology Officer Mira Murati, said they planned to depart the startup -- the latest in a series of high-ranking employees to leave in recent months.

While OpenAI unveiled a new AI model at its first-ever developer conference last November, the company said it won't be doing so this year. Instead, OpenAI was focusing on showing off new API capabilities and meeting with some of the 3 million developers working with its software in different parts of the world. Tuesday's event was intended to be the first of three, with others to be held in London and Singapore.

At a press briefing ahead of the event, OpenAI employees demonstrated the new voice feature with a fictional travel app. Romain Huet, OpenAI's head of developer experience, asked the fake app to call a made-up business, Ilan's Strawberries, and order 400 chocolate-covered berries while keeping the budget under $1,500. Immediately, the app placed a call -- a feature made possible, it was explained, by the app's use of an API from cloud communications company Twilio -- and a phone next to Huet rang. Huet picked up and, acting as if he was a worker at the strawberry business, took an order from the voice assistant. He said it would cost about $1,200 for the strawberries.

The voice used in the demo did not identify itself as an AI voice assistant -- a lack of disclosure that might concern some users as well as privacy and digital rights advocates if applied to actual apps. OpenAI said after the demo it doesn't mandate that developers inform users in any specific way when they're interacting with AI. But the company pointed out that its usage policies state developers must generally tell users when they're using AI, unless it's made obvious by the type of exchange. "There is no strict guideline of how you have to do it," said Olivier Godement, the head of product for OpenAI's API. "We just want to make sure it's obviously clear to the user."

OpenAI first teased a real-time voice feature in May for ChatGPT, showing how it could quickly talk back to users in response to written and visual prompts. But the next month, OpenAI delayed launching the option to work through potential safety issues. After the delay, OpenAI said the product would not be able to impersonate how other people speak.
The company also said that it had added new filters to ensure the software can spot and refuse some requests to generate music or other forms of copyrighted audio. In July, OpenAI offered the feature to a limited number of its ChatGPT Plus customers. Last month, the startup began rolling it out to all paying ChatGPT users.
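For a sense of how a call-placing assistant like the one in the demo could be wired together, here is a minimal sketch that combines OpenAI tool calling with Twilio's Python SDK. It is an illustration only: the tool name place_order_call, the phone numbers, and the TwiML endpoint URL are hypothetical, and the audio bridge between the live call and the voice model is omitted.

```python
import json
from openai import OpenAI
from twilio.rest import Client as TwilioClient

openai_client = OpenAI()        # reads OPENAI_API_KEY from the environment
twilio_client = TwilioClient()  # reads TWILIO_ACCOUNT_SID / TWILIO_AUTH_TOKEN

# Hypothetical tool the model can invoke to place an outbound call.
tools = [{
    "type": "function",
    "function": {
        "name": "place_order_call",
        "description": "Call a business and place an order on the user's behalf.",
        "parameters": {
            "type": "object",
            "properties": {
                "phone_number": {"type": "string"},
                "order": {"type": "string"},
                "budget_usd": {"type": "number"},
            },
            "required": ["phone_number", "order"],
        },
    },
}]

response = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user",
               "content": "Call Ilan's Strawberries at +1-555-010-0000 and order "
                          "400 chocolate-covered strawberries, budget under $1,500."}],
    tools=tools,
)

# Assumes the model chose to call the tool; a real app would check for None.
tool_call = response.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)

# Twilio dials the number; the TwiML served at `url` would bridge the call
# audio to the voice assistant (that bridging layer is not shown here).
twilio_client.calls.create(
    to=args["phone_number"],
    from_="+15550100001",                     # placeholder caller ID
    url="https://example.com/voice-bridge",   # hypothetical TwiML endpoint
)
```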
[2]
OpenAI Opens New AI Voice Capability to Third-Party Developers | PYMNTS.com
OpenAI is expanding access to its real-time voice assistant technology, allowing businesses and developers to integrate the artificial intelligence (AI)-powered tool into their own applications, Bloomberg reported. This move is set to broaden the reach of realistic spoken AI interactions across a wide range of tasks and industries.

The new feature, which resembles the voice assistant available to paying ChatGPT users, will be accessible for those using OpenAI's application programming interface (API). This development opens up possibilities for creating voice-based customer service bots, travel apps capable of making phone calls on behalf of users, and various other applications.

Several companies have already begun testing the voice capability, including health coaching app Healthify and language learning platform Speak. These early adopters showcase the potential for diverse implementations of the technology across different sectors.

The announcement came Tuesday (Oct. 1) as part of a series of updates being unveiled at OpenAI's developer event in San Francisco. This conference serves as a platform for OpenAI to demonstrate its strategy for maintaining its edge in a competitive AI software market, especially as the company seeks to secure a big funding round.

The timing of the event is notable, following recent news of several high-ranking departures from OpenAI, including Chief Technology Officer Mira Murati. These exits are part of a trend of leadership changes at the startup in recent months.

Unlike last year's developer conference, where OpenAI introduced a new AI model, this year's focus is on showcasing enhanced API capabilities and engaging with its growing developer community. The company reports that about 3 million developers are working with its software globally.

To illustrate the new voice feature's capabilities, OpenAI employees demonstrated a fictional travel app during a press briefing. The app, leveraging OpenAI's technology in conjunction with Twilio's cloud communications API, placed a mock phone call to order chocolate-covered strawberries, showing the potential for seamless voice-based interactions in real-world scenarios.

This developer event in San Francisco is the first of three planned gatherings, with additional events scheduled in London and Singapore.
[3]
OpenAI introduces new tools to fast-track building of AI voice assistants
OpenAI introduces new tools to streamline AI app development, including a real-time voice application tool and fine-tuning features for models using images and text. The enhancements, aimed at maintaining competitiveness with tech giants like Google, are expected to significantly boost OpenAI's revenue.

OpenAI unveiled a host of new tools on Tuesday that would make it easier for developers to build applications based on its artificial intelligence technology, as the ChatGPT maker wrestles with tech giants to keep up in the generative AI race.

The Microsoft-backed startup said a new real-time tool, rolling out immediately for testing, would allow developers to create AI voice applications using a single set of instructions. The process previously required developers to go through at least three steps: first transcribing the audio, then running a text-generation model to come up with an answer to the query, and finally using a separate text-to-speech model.

A large chunk of OpenAI's revenue comes from businesses that use its services to build their own AI applications, making the rollout of advanced capabilities a key selling point. Competition has also been heating up as technology giants, including Google-parent Alphabet, integrate AI models capable of crunching different forms of information such as video, audio and text across their businesses.

OpenAI expects its revenue to jump to $11.6 billion next year from an estimated $3.7 billion in 2024, Reuters reported last month. The company is also in the middle of a $6.5 billion fundraise that could value it at $150 billion.

As part of Tuesday's rollout, OpenAI introduced a fine-tuning tool for models after training that would allow developers to improve the responses generated by models using images and text. This fine-tuning process can include feedback from humans who feed the model examples of good and bad answers based on its responses. Using images to fine-tune models would give them stronger image understanding capabilities, enabling applications such as enhanced visual search and improved object detection for autonomous vehicles, OpenAI said.

The startup also unveiled a tool that would allow smaller models to learn from larger ones, along with "Prompt Caching," which cuts some development costs by half by reusing pieces of text the AI has previously processed.

(Reporting by Arsheeya Bajwa in Bengaluru and Anna Tong in San Francisco; Editing by Shilpi Majumdar and Alan Barona)
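For context on what the single-instruction tool replaces, below is a minimal sketch of the older three-step pipeline described above, using the openai Python SDK (v1-style client). The model names and file handling are illustrative choices, not details from the announcement.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Step 1: transcribe the user's spoken question to text.
with open("question.wav", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

# Step 2: run a text model to generate an answer to the transcribed query.
completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": transcript.text}],
)
answer_text = completion.choices[0].message.content

# Step 3: convert the text answer back into speech with a separate TTS model.
speech = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input=answer_text,
)
with open("answer.mp3", "wb") as out:
    out.write(speech.content)
```

Each of those hops adds latency, which is the overhead the new single-call real-time interface is designed to remove.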
[4]
OpenAI makes it easy to develop applications based on their AI: these are their new tools - Softonic
The company has introduced several features with the aim of facilitating the creation of AI applications.

OpenAI, the company behind the popular ChatGPT, has taken a significant step in the artificial intelligence race with the launch of new tools aimed at developers. This Tuesday, the startup introduced several features with the goal of facilitating the creation of AI applications, in a context of intense competition where tech giants like Microsoft and Google are also integrating artificial intelligence into their services.

One of the main innovations is a real-time tool for developing voice applications with artificial intelligence. Previously, the process required three steps: transcribing the audio, generating a text response, and converting that response into voice. Now, everything is done with a single set of instructions, simplifying and speeding up the development of these applications.

Among the new tools, a fine-tuning option for models after training also stands out. This system allows developers to improve AI responses using images and text, offering more precise applications such as enhanced object detection or more advanced visual search. Additionally, OpenAI has introduced "Prompt Caching," which reduces costs by reusing previously processed text snippets.

OpenAI has found an important source of revenue in companies that use its services to create their own applications. These new advanced capabilities not only enhance the offering but are also crucial to the company's sales strategy. The startup also aims to consolidate its position in a market where more and more companies are developing AI models that combine various forms of information, such as text, video, and audio.

OpenAI also expects its revenue to grow significantly: it could reach $11.6 billion in 2025, compared with the $3.7 billion projected for 2024, according to its estimates. Additionally, the company is in the process of raising $6.5 billion, which could value it at around $150 billion.
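To show how prompt-caching savings are typically realized in practice, here is a hedged sketch. It assumes caching is applied automatically when a long prompt prefix (reportedly 1,024 tokens or more) is repeated verbatim across requests, and that cached input tokens are reported in the response's usage details; field availability varies by model and SDK version.

```python
from openai import OpenAI

client = OpenAI()

# A long, static system prompt. Keeping it byte-identical at the start of
# every request lets the API reuse (cache) the already-processed prefix.
SYSTEM_PROMPT = "You are a travel-booking assistant. " + "Policy details... " * 400

def ask(question: str):
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},  # identical prefix
            {"role": "user", "content": question},          # only this part varies
        ],
    )

first = ask("Find me a flight to Tokyo in March.")
second = ask("And a hotel near Shibuya for the same dates.")

# If the prefix was cached, later responses should report cached input tokens.
details = getattr(second.usage, "prompt_tokens_details", None)
print(details.cached_tokens if details else "cached token details not reported")
```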
[5]
OpenAI's new updates enhance voice and vision capabilities of AI
Artificial intelligence developer OpenAI entered October with several updates to its models, helping them engage in better conversations and improve image recognition. On Oct. 1, OpenAI unveiled four updates that introduce new tools designed to make it easier for developers to build on its AI models.

It speaks!

One major update is the Realtime API, which allows developers to create AI-generated voice applications using a single prompt. The tool, available for testing, supports low-latency, multimodal experiences by streaming audio inputs and outputs, enabling natural conversations similar to ChatGPT's Advanced Voice Mode.

Previously, developers had to "stitch together" multiple models to create these experiences. Audio input would typically need to be fully uploaded and processed before receiving a response, which meant higher latency for real-time applications like speech-to-speech conversations. With the Realtime API's streaming capability, developers can now enable immediate, natural interactions, much like voice assistants. The API runs on GPT-4o, released in May 2024, which can reason across audio, vision and text in real time.

AI can see clearly now

Another update is a fine-tuning tool for developers, allowing them to improve AI responses generated from images and text inputs. Fine-tuning with images gives the AI a better capacity to understand them, in turn enhancing visual search and object detection capabilities, according to the developer. The process includes feedback from humans who provide examples of good and bad responses.

In addition to its voice and vision updates, OpenAI also rolled out "model distillation" and "prompt caching," which respectively allow smaller models to learn from larger ones and reduce development costs and time by reusing already processed text.

The advanced capabilities of its models are a key selling point, as a major chunk of OpenAI's revenue comes from businesses building their own applications on top of OpenAI's technology. According to Reuters, OpenAI projects its revenue to rise to $11.6 billion next year, up from an estimated $3.7 billion in 2024.
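As a rough sketch of how a developer might open a Realtime API session described above, the snippet below uses the WebSocket endpoint, beta header, and event names published with the launch. The model string and event schema are assumptions to verify against OpenAI's current documentation, and actual audio capture and playback are omitted.

```python
# pip install websockets
import asyncio
import json
import os

import websockets

URL = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview-2024-10-01"
HEADERS = {
    "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
    "OpenAI-Beta": "realtime=v1",
}

async def main():
    # Note: newer websockets releases name this kwarg `additional_headers`.
    async with websockets.connect(URL, extra_headers=HEADERS) as ws:
        # Ask the session to respond with both audio and text.
        await ws.send(json.dumps({
            "type": "response.create",
            "response": {
                "modalities": ["audio", "text"],
                "instructions": "Greet the caller and ask how you can help.",
            },
        }))
        # Stream server events; audio arrives as base64-encoded deltas.
        async for message in ws:
            event = json.loads(message)
            if event.get("type") == "response.audio.delta":
                pass  # feed decoded audio chunks to the speaker here
            elif event.get("type") == "response.done":
                break

asyncio.run(main())
```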
OpenAI introduces a suite of new tools for developers, including real-time voice capabilities and improved image processing, aimed at simplifying AI application development and maintaining its competitive edge in the AI market.
OpenAI, the company behind ChatGPT, has unveiled a series of new tools and updates aimed at enhancing AI application development. These innovations, announced at a developer event in San Francisco, are designed to streamline the creation of AI-powered applications and maintain OpenAI's competitive edge in the rapidly evolving AI market [1][2].
A key highlight of the announcement is the introduction of a real-time voice assistant tool. This feature allows developers to integrate OpenAI's voice technology into their applications using a single set of instructions, significantly simplifying the process [1][3]. Previously, developers had to navigate through multiple steps involving audio transcription, text generation, and text-to-speech conversion [5].
The new capability enables the creation of various applications, including:
- Voice-based customer service bots [1][2]
- Travel apps that place phone calls on a user's behalf [1][2]
- Health coaching and language learning tools, such as early testers Healthify and Speak [1][2]
OpenAI has also introduced a fine-tuning tool for models that work with images and text. This feature allows developers to improve AI responses by incorporating human feedback, enhancing the model's image understanding capabilities [3][5]. The improvements could lead to:
- Enhanced visual search [3][5]
- Improved object detection for applications such as autonomous vehicles [3]
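As an illustration of what image-based fine-tuning input might look like (the dataset content, file name, and model snapshot below are placeholders, not details from the articles), training examples are JSONL chat exchanges whose user turns can include images; the file is uploaded and then used to start a fine-tuning job:

```python
import json
from openai import OpenAI

client = OpenAI()

# One hypothetical training example: a chat exchange whose user turn includes an image.
example = {
    "messages": [
        {"role": "user", "content": [
            {"type": "text", "text": "What traffic sign is shown in this image?"},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/signs/stop_sign_042.jpg"}},
        ]},
        {"role": "assistant", "content": "A stop sign."},
    ]
}

with open("vision_training.jsonl", "w") as f:
    f.write(json.dumps(example) + "\n")  # a real dataset would contain many lines

# Upload the dataset and start a fine-tuning job on a vision-capable model.
training_file = client.files.create(
    file=open("vision_training.jsonl", "rb"),
    purpose="fine-tune",
)
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-2024-08-06",  # snapshot name is an assumption; check current docs
)
print(job.id)
```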
Other notable updates include:
- Model distillation, which lets smaller models learn from larger ones [3][5]
- Prompt caching, which cuts some development costs by reusing text the AI has already processed [3][4][5]
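One way to read "smaller models learning from larger ones" concretely is to capture a larger model's answers and fine-tune a smaller model on them. The sketch below does this manually with ordinary chat completions and the fine-tuning endpoint; it is not OpenAI's hosted distillation workflow, and the prompts, file name, and model snapshots are placeholders.

```python
import json
from openai import OpenAI

client = OpenAI()

prompts = [
    "Summarize the key changes in the latest API release.",
    "Explain prompt caching to a new developer in two sentences.",
]

# 1. Collect outputs from the larger "teacher" model.
examples = []
for prompt in prompts:
    teacher = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    examples.append({
        "messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": teacher.choices[0].message.content},
        ]
    })

# 2. Write the teacher's answers as a fine-tuning dataset.
with open("distillation.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# 3. Fine-tune the smaller "student" model on the teacher's outputs.
training_file = client.files.create(
    file=open("distillation.jsonl", "rb"), purpose="fine-tune"
)
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",  # smaller model; snapshot name is an assumption
)
print(job.status)
```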
These updates come at a crucial time for OpenAI as it faces increasing competition from tech giants like Google and Microsoft in the AI space [3][4]. The company's focus on empowering developers aligns with its business model, as a significant portion of OpenAI's revenue comes from businesses using its services to build their own AI applications [1][3].
OpenAI's strategic moves are backed by ambitious financial projections:
- Revenue is expected to rise to $11.6 billion next year from an estimated $3.7 billion in 2024 [3][4][5]
- The company is raising $6.5 billion in a round that could value it at about $150 billion [3][4]
The introduction of these tools is expected to accelerate AI integration across various sectors, from customer service to autonomous vehicles. As OpenAI continues to innovate and expand its offerings, the AI landscape is likely to see rapid advancements in voice technology, image processing, and multimodal AI applications [1][2][5].
With these updates, OpenAI aims to solidify its position as a leader in the AI industry, providing developers with powerful tools to create more sophisticated and user-friendly AI applications. The company's focus on simplifying complex AI processes for developers could potentially lead to a new wave of AI-powered innovations across multiple industries [4][5].