Stability AI has open-sourced Stable Audio Open Small, a 341M-parameter text-to-audio model designed to run directly on smartphones and edge devices without requiring cloud computing.
Why you'll love it: Unlike most AI audio tools that require powerful servers, this model can generate 11-second audio clips on a smartphone in under 8 seconds.
It's perfect for creating quick sound effects, musical samples, or ambient sounds without an internet connection. 🎧
How to use it: The model is available on GitHub for developers to integrate into apps, with demo implementations showing how to run it efficiently on Arm CPUs.
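For a feel of what integration looks like, here is a minimal sketch based on the stable-audio-tools API used by earlier Stable Audio Open releases; the checkpoint name, sampling settings, and exact interface for the small on-device model are assumptions, so treat this as illustrative rather than the official demo code.

```python
import torch
import torchaudio
from einops import rearrange
from stable_audio_tools import get_pretrained_model
from stable_audio_tools.inference.generation import generate_diffusion_cond

device = "cuda" if torch.cuda.is_available() else "cpu"

# Assumed Hugging Face checkpoint name for the small on-device model
model, model_config = get_pretrained_model("stabilityai/stable-audio-open-small")
model = model.to(device)
sample_rate = model_config["sample_rate"]
sample_size = model_config["sample_size"]

# Text prompt plus timing conditioning for a short clip
conditioning = [{
    "prompt": "gentle rain falling on a tin roof",
    "seconds_start": 0,
    "seconds_total": 11,
}]

# Run the diffusion sampler; a low step count (assumed here) is what keeps
# on-device latency in the seconds range
output = generate_diffusion_cond(
    model,
    steps=8,
    conditioning=conditioning,
    sample_size=sample_size,
    device=device,
)

# Collapse the batch dimension, peak-normalize, and save as 16-bit WAV
output = rearrange(output, "b d n -> d (b n)")
output = output.to(torch.float32).div(torch.max(torch.abs(output)))
output = output.clamp(-1, 1).mul(32767).to(torch.int16).cpu()
torchaudio.save("output.wav", output, sample_rate)
```

On a phone, the GitHub demos take a different route (Arm-optimized CPU inference rather than a Python stack), but the flow is the same: text prompt in, short waveform out.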