2 Sources
[1]
Stability AI releases an audio-generating model that can run on smartphones | TechCrunch
AI startup Stability AI has released Stable Audio Open Small, a "stereo" audio-generating AI model that the company claims is the fastest on the market -- and efficient enough to run on smartphones. Stable Audio Open Small is the fruit of a collaboration between Stability AI and Arm, the chipmaker that produces many of the processors inside tablets, phones, and other mobile devices. While a number of AI-powered apps can generate audio, like Suno and Udio, most rely on cloud processing, meaning that they can't be used offline. Stability also claims that Stable Audio Open Small's training set is made up entirely of songs from the royalty-free audio libraries Free Music Archive and Freesound. That's as opposed to the training sets of the aforementioned Suno and Udio, which reportedly contain copyrighted content, posing an IP risk. Stable Audio Open Small is 341 million parameters in size and optimized to run on Arm CPUs. (Parameters, sometimes referred to as weights, are the internal components of a model that guide its behavior.) Designed for quickly generating short audio samples and sound effects (e.g., drum and instrument riffs), Stable Audio Open Small can produce up to 11 seconds of audio on a smartphone in less than 8 seconds, claims Stability AI. The model isn't without its limitations. Stable Audio Open Small only supports prompts written in English, and Stability notes in its documentation that the model can't generate realistic vocals or high-quality songs. The model also doesn't perform equally well across musical styles, Stability warns -- a consequence of its Western-biased training data. In another potential wrinkle for devs, Stable Audio Open Small has somewhat restrictive usage terms.
It's free to use for researchers, hobbyists, and businesses with less than $1 million in annual revenue, but developers and organizations making over $1 million in revenue have to pay for Stability's enterprise license. Stability, the beleaguered firm behind the popular image generation model Stable Diffusion, raised new cash last year as investors, including Eric Schmidt and Napster founder Sean Parker, sought to turn the business around. Emad Mostaque, Stability's co-founder and ex-CEO, reportedly mismanaged Stability into financial ruin, leading staff to resign, a partnership with Canva to fall through, and investors to grow concerned about the company's prospects. In the last few months, Stability has hired a new CEO, appointed Titanic director James Cameron to its board of directors, and released several new image generation models.
[2]
Stability AI, Arm Unveil Lightweight Audio Model That Runs On-Device
Stability AI developed a new text-to-audio generation artificial intelligence (AI) model in partnership with Arm. Announced on Wednesday, the new model is dubbed Stable Audio Open Small, and it is said to generate short audio samples using text prompts. The London-based AI firm said that the model is lightweight and is optimised to run entirely on Arm CPUs. It is also said to have a fast generation time, making it useful for bulk use cases. The open-source audio model is available to download from GitHub and Hugging Face. In a newsroom post, the AI firm detailed the new large language model. It is a distilled version of the Stable Audio Open model, which was released in June 2024, and can generate up to 47 seconds of audio. The smaller text-to-audio model was designed with a focus on faster generation speed and smaller size. The Stable Audio Open Small is a 341 million parameter model that can generate up to 11 seconds of audio. The company claims that it can generate an audio sample in less than eight seconds while running locally on a smartphone. Interestingly, Stability AI and Arm announced their collaboration for generative audio creation at Mobile World Congress (MWC) 2025. Coming to the architecture and training, the Stable Audio Open Small is a latent diffusion model based on a transformer architecture. It is trained on a dataset of 486,492 audio recordings. The company said that all audio files are licensed. For text conditioning, a publicly available pre-trained T5 model was used. The AI firm used the Adversarial Relativistic-Contrastive (ARC) algorithm in the post-training phase to improve prompt adherence and increase the inference speed. As per the company, this text-to-audio model is suited for creating drum loops, foley, instrument riffs, and ambient textures. Due to its small size, it can be deployed on Arm-powered smartphones as well as edge devices. The model can also be used in scenarios where real-time generation and responsiveness matter.
Stable Audio Open Small's model weights can be downloaded on the AI firm's Hugging Face listing, and the code base can be found on the GitHub listing. The AI model is available for commercial and non-commercial use under the permissive Stability AI Community Licence.
Stability AI, in collaboration with Arm, has released Stable Audio Open Small, a compact and efficient audio-generating AI model capable of running on smartphones and other mobile devices.
Stability AI, the AI startup behind the popular image generation model Stable Diffusion, has released a new model developed in collaboration with chipmaker Arm. Stable Audio Open Small is a lightweight text-to-audio model designed to run efficiently on smartphones and other mobile devices [1].
Stable Audio Open Small is a 341 million parameter model optimized for Arm CPUs. Stability AI claims it can generate up to 11 seconds of audio in less than 8 seconds while running locally on a smartphone [1]. The model is aimed at short audio samples and sound effects, such as drum loops, foley, instrument riffs, and ambient textures [2].
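To put those figures in perspective, the short Python sketch below works out the implied real-time factor and a rough size for the weights. The 341 million parameters, 11 seconds, and 8 seconds come from the coverage above; the bytes-per-parameter values are assumptions about common storage formats, since neither source says which numeric format the released weights use, and the helper names are illustrative only.

```python
# Figures quoted in the coverage (Stability AI's claims, not measurements).
PARAMS = 341_000_000
AUDIO_SECONDS = 11.0
GENERATION_SECONDS = 8.0  # upper bound: "less than 8 seconds"

# ASSUMPTION: typical bytes per parameter; the article does not state
# which format the released weights actually use.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def real_time_factor(audio_s: float, wall_s: float) -> float:
    """Seconds of audio produced per second of compute."""
    return audio_s / wall_s


def weight_megabytes(params: int, bytes_per_param: int) -> float:
    """Approximate size of the weights alone, in megabytes (10**6 bytes)."""
    return params * bytes_per_param / 1e6


rtf = real_time_factor(AUDIO_SECONDS, GENERATION_SECONDS)
print(f"real-time factor: at least {rtf:.3f}x")
for fmt, nbytes in BYTES_PER_PARAM.items():
    print(f"{fmt}: ~{weight_megabytes(PARAMS, nbytes):.0f} MB of weights")
```

Because 8 seconds is an upper bound on generation time, the computed factor is a lower bound on speed. The memory figures cover the weights only and ignore activations and runtime overhead.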
The model is a latent diffusion model built on a transformer architecture. It was trained on a dataset of 486,492 audio recordings, all of which Stability AI says are licensed. For text conditioning, it uses a publicly available pre-trained T5 model. Stability AI also applied the Adversarial Relativistic-Contrastive (ARC) algorithm in post-training to improve prompt adherence and increase inference speed [2].
Unlike many AI-powered audio generation apps, such as Suno and Udio, which rely on cloud processing, Stable Audio Open Small can run offline. That makes it usable where internet connectivity is unavailable or where real-time generation and responsiveness matter [1][2].
Stability AI claims that Stable Audio Open Small's training set consists entirely of songs from royalty-free audio libraries, specifically the Free Music Archive and Freesound. This approach could reduce the intellectual property risk associated with training on copyrighted content [1].
However, the model does have limitations. It only supports prompts in English and cannot generate realistic vocals or high-quality songs. Its performance also varies across musical styles, which Stability attributes to Western-biased training data [1].
The model weights are available for download from Stability AI's Hugging Face listing, and the code base is on GitHub. The model is released under the Stability AI Community License, which permits both commercial and non-commercial use subject to a revenue limit: it is free for researchers, hobbyists, and businesses with less than $1 million in annual revenue, while organizations making more than that must purchase an enterprise license [1][2].
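The revenue rule described above can be expressed as a simple check. The sketch below is based only on the article's summary of the terms, not on the license text itself. The function name and the tier labels are hypothetical, and the handling of exactly $1 million is an assumption, because the coverage says "less than" for the free tier and "over" for the paid one without addressing the boundary.

```python
# Threshold as reported in the coverage; the license text is authoritative.
REVENUE_THRESHOLD_USD = 1_000_000


def required_license(annual_revenue_usd: float) -> str:
    """Return which tier applies under the article's summary of the terms.

    ASSUMPTION: revenue of exactly $1 million is treated as needing the
    enterprise license; the sources do not say how the boundary is handled.
    """
    if annual_revenue_usd < 0:
        raise ValueError("annual revenue cannot be negative")
    if annual_revenue_usd < REVENUE_THRESHOLD_USD:
        return "community"  # free for researchers, hobbyists, small businesses
    return "enterprise"     # paid license from Stability AI


print(required_license(250_000))    # a small studio
print(required_license(5_000_000))  # a larger organization
```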
This release comes at a pivotal time for Stability AI. The company has faced financial difficulties and leadership changes, but it has since raised new funding, hired a new CEO, added "Titanic" director James Cameron to its board, and released several new image generation models [1].
The collaboration with Arm, announced at Mobile World Congress 2025, marks a strategic move into on-device AI for mobile hardware [2].