2 Sources
[1]
Stability AI and Arm Bring Offline Generative Audio to Smartphones
Stability AI, known for its Stable Diffusion text-to-image models, has collaborated with global semiconductor giant Arm to add generative audio AI capabilities to mobile devices. With this partnership, it has managed to run Stable Audio Open, its text-to-audio model, entirely on Arm CPUs. This involves generating sound effects, audio samples, and production elements in seconds, all on-device, and without needing an internet connection. Stability AI stated, "As generative AI becomes increasingly integral to both enterprises and professional creators alike, it's crucial that our models and workflows are easily accessible everywhere builders build and creators create, providing seamless integration into their visual media production pipelines." To address the increasing demand, the company aimed to run its models efficiently at the edge. It was a challenge to optimise the Stable Audio Open model for mobile devices. It was tested on a device with an Arm CPU, initially taking 240 seconds. With distillation of the model and using Arm's software stack, like the int8 matmul kernels from KleidiAI in ExecuTorch via XNNPack, it was able to reduce the generation time for an 11-second clip to under 8 seconds on Armv9 CPUs. This resulted in a 30x faster response time. One would require a compatible mobile device to try the capability. Considering that most smartphones today feature Arm-based CPUs, it should be accessible to all kinds of users. Stability AI also plans to bring all its models across image, video, and 3D to the edge, aiming to transform how visual media is created on mobile devices.
[2]
Stability AI optimized its audio generation model to run on Arm chips | TechCrunch
AI startup Stability AI has teamed up with chipmaker Arm to bring Stability's Stable Audio Open, an AI model that can generate audio including sound effects, to mobile devices running Arm chips. While a number of AI-powered apps can generate audio, like Suno and Udio, most rely on cloud processing, meaning that they can't be used offline. Moreover, some audio generation models were trained on copyrighted content -- posing an IP risk. Stability claims Stable Audio Open's training set is made up entirely of royalty-free audio and songs. Stable Audio Open running on Arm chips, which will be demoed at the Mobile World Congress conference in Barcelona this week, can generate a sound from a text description like, "Gentle ocean waves at sunset." Stability says that it worked with Arm to optimize and "distill" Stable Audio Open, speeding up generation times by 30x. Generating a single 11-second audio sample takes around 8 seconds on an Armv9 CPU. To be clear, the optimized Stable Audio Open model isn't available to download -- at least not yet. But in a statement, Stability CEO Prem Akkaraju hinted that Stability will work to bring its models, including Stable Audio Open, to consumer apps and devices in the future. "As more and more professional creatives and businesses adopt generative AI to power their production pipeline, it's important that our models and workflows are available everywhere for builders to build and creators to create," Akkaraju said. "We are excited to partner with Arm for this exact reason." Stability says it's collaborating with Arm to further optimize and fine-tune Stable Audio Open for mobile. Stability, the beleaguered firm behind the popular image generation model Stable Diffusion, raised new cash last year as investors including Eric Schmidt and Napster founder Sean Parker sought to turn the business around. 
Emad Mostaque, Stability's co-founder and ex-CEO, reportedly mismanaged Stability into financial ruin, leading staff to resign, a partnership with Canva to fall through, and investors to grow concerned about the company's prospects. In the last few months, Stability has hired a new CEO, appointed Titanic director James Cameron to its board of directors, and released several new image generation models.
Stability AI collaborates with Arm to optimize Stable Audio Open for mobile devices, enabling offline generative audio capabilities on smartphones with Arm CPUs.
Stability AI, the company behind the Stable Diffusion text-to-image models, has partnered with semiconductor company Arm to bring generative audio to mobile devices. The two companies have run Stability's text-to-audio model, Stable Audio Open, entirely on Arm CPUs, so that sound effects, audio samples, and production elements can be generated directly on a smartphone without an internet connection [1].
The main challenge was optimizing Stable Audio Open for mobile hardware. In initial tests on a device with an Arm CPU, generating audio took 240 seconds. Model distillation, combined with Arm's software stack, including the int8 matmul kernels from KleidiAI used in ExecuTorch via XNNPack, cut that time substantially [1].
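To make the term "int8 matmul" concrete, here is a minimal pure-Python sketch of a quantized matrix multiply. It is not KleidiAI's or XNNPack's implementation, and the sources do not say which quantization scheme the optimized model uses; symmetric per-tensor quantization is assumed here purely for illustration, and the function names `quantize`, `int8_matmul`, and `float_matmul` are hypothetical.

```python
def quantize(matrix):
    """Map a float matrix to integers in [-127, 127] with one shared scale.

    Assumption: symmetric per-tensor quantization, chosen for illustration.
    """
    max_abs = max(abs(v) for row in matrix for v in row)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [[max(-127, min(127, round(v / scale))) for v in row] for row in matrix]
    return q, scale


def int8_matmul(a, b):
    """Multiply two float matrices using int8-range integer arithmetic."""
    qa, scale_a = quantize(a)
    qb, scale_b = quantize(b)
    cols_b = list(zip(*qb))
    # Integer multiply-accumulate: the part an optimized kernel speeds up.
    acc = [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in qa]
    # Rescale the integer accumulators back to floating point.
    return [[v * scale_a * scale_b for v in row] for row in acc]


def float_matmul(a, b):
    """Reference float matrix multiply, used to measure quantization error."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


a = [[0.5, -1.0], [2.0, 0.25]]
b = [[1.0, 0.0], [-0.5, 0.75]]
approx = int8_matmul(a, b)
exact = float_matmul(a, b)
max_error = max(
    abs(x - y) for row_x, row_y in zip(approx, exact) for x, y in zip(row_x, row_y)
)
print(f"largest absolute error: {max_error:.4f}")
```

The integer result stays close to the float result while each stored value needs only 8 bits, which is the general trade-off that makes int8 kernels attractive on mobile CPUs.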
The optimized model generates an 11-second audio clip in under 8 seconds on Armv9 CPUs, roughly 30x faster than the initial 240 seconds [1][2]. Because the clip is produced in less time than it takes to play, on-device audio generation becomes practical for mobile users.
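The reported figures can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the sources (240 seconds, 8 seconds, an 11-second clip); the variable and function names are illustrative, and 8 seconds is treated as the upper bound of "under 8 seconds".

```python
# Figures reported in the sources; names are illustrative.
baseline_seconds = 240.0   # initial generation time on an Arm CPU
optimized_seconds = 8.0    # upper bound after distillation and int8 kernels
clip_seconds = 11.0        # length of the generated audio clip


def speedup(before: float, after: float) -> float:
    """Return how many times faster `after` is than `before`."""
    return before / after


def real_time_factor(generation_seconds: float, audio_seconds: float) -> float:
    """Compute seconds spent per second of audio; below 1.0 beats real time."""
    return generation_seconds / audio_seconds


print(f"speedup: {speedup(baseline_seconds, optimized_seconds):.0f}x")
print(f"real-time factor: {real_time_factor(optimized_seconds, clip_seconds):.2f}")
# prints:
# speedup: 30x
# real-time factor: 0.73
```

A real-time factor below 1.0 means the clip is ready before it would finish playing.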
Unlike AI audio generation apps such as Suno and Udio, most of which rely on cloud processing, Stable Audio Open running on Arm chips works offline. Stability AI also claims that Stable Audio Open's training set consists entirely of royalty-free audio and songs, which could reduce the intellectual property risk associated with audio models trained on copyrighted content [2].
Stability AI CEO Prem Akkaraju said it is important that the company's models and workflows are available wherever creators and businesses adopt generative AI in their production pipelines [2]. The company plans to bring all of its models across image, video, and 3D to the edge, aiming to transform how visual media is created on mobile devices [1].
The optimized Stable Audio Open model is not yet available for download. Stability AI says it is continuing to work with Arm to further optimize and fine-tune the model for mobile, and the technology will be demonstrated at the Mobile World Congress conference in Barcelona [2].
The partnership comes as Stability AI tries to recover from a difficult period. The company raised new capital last year from investors including Eric Schmidt and Sean Parker, and in recent months it has hired a new CEO, appointed Titanic director James Cameron to its board of directors, and released several new image generation models. These moves follow reported mismanagement under co-founder and former CEO Emad Mostaque [2].
Summarized by
Navi