
    Stability AI Unveils Game-Changing Audio Generation Model

    "Stable Audio" Promises to Revolutionize Audio Generation with Unprecedented Control and Efficiency

    by Daily AI Watch
    17. September 2023

    Key Points:

    • Stability AI has unveiled “Stable Audio,” a latent diffusion model designed to revolutionize audio generation.
    • The model offers unprecedented control over content and length of generated audio, including the creation of complete songs.
    • Stable Audio uses advanced diffusion sampling techniques for rapid generation of high-quality audio.

    Innovative Audio Generation with Stable Audio
    Stability AI has introduced “Stable Audio,” a latent diffusion model for audio generation. The model is conditioned on text metadata together with audio duration and start time, which gives users control over both the content and the length of the generated audio, up to and including complete songs. This addresses a limitation of earlier audio diffusion models, which are typically trained to produce fixed-length output and therefore struggle to generate audio of other durations.

    Accelerated Inference and High-Quality Output
    One of the standout features of Stable Audio is that it works on a heavily downsampled latent representation of audio, which makes inference much faster than running diffusion directly on raw audio samples. According to Stability AI, the flagship Stable Audio model can generate 95 seconds of stereo audio at a 44.1 kHz sample rate in under one second on an NVIDIA A100 GPU. The speed comes from this compact latent representation combined with recent advances in diffusion sampling techniques.
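To put those figures in perspective, the short Python sketch below works out how many raw sample values a 95-second stereo clip contains and what "under a second" implies as a real-time factor. The article does not state the latent downsampling ratio, so `downsample_factor` is a hypothetical parameter and the value 1024 in the usage lines is purely illustrative.

```python
import math

SAMPLE_RATE_HZ = 44_100  # sample rate quoted in the article
CHANNELS = 2             # stereo
CLIP_SECONDS = 95        # clip length quoted in the article


def raw_sample_count(seconds: float,
                     sample_rate: int = SAMPLE_RATE_HZ,
                     channels: int = CHANNELS) -> int:
    """Number of individual sample values in a raw clip."""
    return int(seconds * sample_rate) * channels


def latent_frame_count(seconds: float,
                       downsample_factor: int,
                       sample_rate: int = SAMPLE_RATE_HZ) -> int:
    """Latent time steps after temporal downsampling.

    downsample_factor is an assumption for illustration; the article
    only says the representation is "heavily downsampled".
    """
    return math.ceil(int(seconds * sample_rate) / downsample_factor)


def min_realtime_factor(audio_seconds: float, wall_clock_seconds: float) -> float:
    """How many seconds of audio are produced per second of compute."""
    return audio_seconds / wall_clock_seconds


if __name__ == "__main__":
    print("raw sample values:", raw_sample_count(CLIP_SECONDS))
    print("latent steps (illustrative factor 1024):",
          latent_frame_count(CLIP_SECONDS, downsample_factor=1024))
    print("real-time factor at exactly 1 s:",
          min_realtime_factor(CLIP_SECONDS, 1.0))
```

Because the diffusion model iterates over latent steps instead of millions of raw sample values, each sampling step touches far less data, which is the intuition behind the quoted speed.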

    Core Architecture and Training of Stable Audio
    The core architecture of Stable Audio consists of a variational autoencoder (VAE), a text encoder, and a U-Net-based conditioned diffusion model. The VAE compresses stereo audio into a noise-resistant, lossy latent encoding, which speeds up both training and generation. The text encoder is taken from a CLAP model, so its text features carry information about the relationships between words and sounds. During training, the model is also conditioned on two timing properties of each audio chunk: where the chunk starts within its source file and how long that file is in total. At inference time, users set the same values to specify the desired length of the generated audio.
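The timing conditioning can be pictured with a minimal sketch. It is not Stability AI's code: the names `TimingCondition`, `training_condition`, and `inference_condition`, the whole-second rounding, and the 95-second window used in the example are all assumptions made for illustration. The sketch only shows how a start time and a total duration could be derived for a training chunk and then set by hand at inference.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TimingCondition:
    """Hypothetical container for the two timing values fed to the model."""
    seconds_start: int  # where the chunk begins inside the source file
    seconds_total: int  # total length of the source file


def training_condition(file_seconds: float,
                       window_seconds: float,
                       rng: random.Random) -> TimingCondition:
    """Pick a random training chunk and record its timing values."""
    # Files shorter than the window can only start at 0.
    latest_start = max(0.0, file_seconds - window_seconds)
    start = rng.uniform(0.0, latest_start)
    return TimingCondition(seconds_start=int(start),
                           seconds_total=int(file_seconds))


def inference_condition(desired_seconds: float,
                        window_seconds: float) -> TimingCondition:
    """Ask for a clip of a given length, starting from the beginning."""
    if not 0 < desired_seconds <= window_seconds:
        raise ValueError("desired length must fit inside the model window")
    return TimingCondition(seconds_start=0,
                           seconds_total=int(desired_seconds))


if __name__ == "__main__":
    rng = random.Random(0)
    # A 3-minute training file seen through an assumed 95-second window.
    print(training_condition(180.0, 95.0, rng))
    # At inference the user requests a 30-second clip.
    print(inference_condition(30.0, 95.0))
```

In a real system these values would be turned into embeddings and passed to the diffusion model alongside the text features; that step is omitted here because the article does not describe it.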

    Extensive Dataset and Future Developments
    To train the flagship Stable Audio model, Stability AI curated a dataset of over 800,000 audio files, amounting to 19,500 hours of audio. Harmonai, Stability AI’s generative audio research lab, says it remains dedicated to advancing model architectures and refining datasets. The team hints at forthcoming releases, including open-source models based on Stable Audio and the training code needed to build on them.
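As a rough sanity check on the dataset figures, the sketch below converts the quoted totals into an average file length. The function name is hypothetical, and because the article says "over 800,000" files, the result should be read as an approximate upper bound on the mean, not an exact statistic.

```python
def mean_file_seconds(total_hours: float, file_count: int) -> float:
    """Average file length in seconds from a total duration and a file count."""
    if file_count <= 0:
        raise ValueError("file_count must be positive")
    return total_hours * 3600 / file_count


if __name__ == "__main__":
    # Figures quoted in the article: 19,500 hours across 800,000 files.
    avg = mean_file_seconds(19_500, 800_000)
    print(f"average file length: about {avg:.2f} seconds")
```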


    Food for Thought:

    1. How will Stable Audio’s advanced capabilities impact the future of audio generation and creative industries?
    2. What are the potential applications and implications of using latent diffusion models in audio creation?
    3. How might the development of open-source models based on Stable Audio influence the broader AI and audio technology community?

    Let us know what you think in the comments below!


    Author and Source: Article by Ryan Daws on Artificial Intelligence News.

    Disclaimer: Summary written by ChatGPT.

    Tags: AI News, Audio generation, Generative AI, Stability AI