Results 1 to 5 of 5

Thread: Jukebox, neural net that generates music, OpenAI Inc., San Francisco, California, USA

  1. #1

    Jukebox, neural net that generates music, OpenAI Inc., San Francisco, California, USA

    Developer - OpenAI Inc.

    openai.com/blog/jukebox

    github.com/openai/jukebox

    "Jukebox: a generative model for music"

    by Prafulla Dhariwal, Heewoo Jun, Christine Payne, Jong Wook Kim, Alec Radford, Ilya Sutskever

  2. #2


    Jukebox: a generative model for music (paper explained)

    May 2, 2020

    This generative model for music can make entire songs with remarkable quality and consistency. It can be conditioned on genre, artist, and even lyrics.

    Abstract:
    We introduce Jukebox, a model that generates music with singing in the raw audio domain. We tackle the long context of raw audio using a multiscale VQ-VAE to compress it to discrete codes, and model those using autoregressive Transformers. We show that the combined model at scale can generate high-fidelity and diverse songs with coherence up to multiple minutes. We can condition on artist and genre to steer the musical and vocal style, and on unaligned lyrics to make the singing more controllable. We are releasing thousands of non-cherry-picked samples, along with model weights and code.

    Authors: Prafulla Dhariwal, Heewoo Jun, Christine Payne, Jong Wook Kim, Alec Radford, Ilya Sutskever
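
    The abstract's first stage — compressing raw audio into discrete codes with a VQ-VAE — boils down to snapping each continuous encoder frame to its nearest vector in a learned codebook. A minimal NumPy sketch of that quantization step (a toy illustration, not OpenAI's implementation; the codebook size and frame dimension here are made up):

    ```python
    import numpy as np

    def vector_quantize(frames, codebook):
        """Map each continuous frame to the index of its nearest codebook
        vector (squared Euclidean distance), as in a VQ-VAE bottleneck."""
        # (n_frames, 1, dim) - (1, n_codes, dim) -> (n_frames, n_codes)
        dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
        return dists.argmin(axis=1)

    rng = np.random.default_rng(0)
    codebook = rng.normal(size=(16, 4))   # 16 learned code vectors of dim 4
    # Three frames lying near codebook entries 3, 7, and 3 again:
    frames = codebook[[3, 7, 3]] + 0.01 * rng.normal(size=(3, 4))
    codes = vector_quantize(frames, codebook)
    print(codes.tolist())  # -> [3, 7, 3]
    ```

    In the real model this runs at multiple time scales ("multiscale"), so a long raw-audio waveform becomes a much shorter sequence of integers that a Transformer can model.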

  3. #3

  4. #4


    OpenAI’s Jukebox AI writes amazing new songs

    Jun 9, 2020

  5. #5


    Prafulla Dhariwal (OpenAI) - Jukebox: a generative model for music

    Aug 12, 2020

    Prafulla Dhariwal (OpenAI)
    Jukebox: A Generative Model for Music
    Presentation recorded June 19, 2020

    Abstract: Music is an extremely challenging domain for generative modeling: it’s highly diverse, humans are perceptive to small errors, and it has extremely long range dependencies to learn if generated as raw audio. We show it’s possible to generate music with singing directly in the raw audio domain. We tackle the long sequence lengths of raw audio using a multi-scale VQ-VAE to compress it to discrete codes, and model those using autoregressive Transformers. We show that the combined model at scale can generate high-fidelity and diverse songs with coherence up to multiple minutes. We can condition on artist and genre to steer the musical and vocal style, and on unaligned lyrics to make the singing more controllable.
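
    The abstract's second stage — modeling the discrete codes with autoregressive Transformers — means generating one code at a time, each conditioned on the codes so far. A minimal sketch of that sampling loop, with a toy transition matrix standing in for the Transformer prior (an illustration of the autoregressive idea only, not the actual model):

    ```python
    import numpy as np

    def sample_codes(prior, n_steps, start, rng):
        """Autoregressively sample discrete codes: at each step the prior
        gives p(next_code | previous code) and we draw the next token."""
        seq = [start]
        for _ in range(n_steps):
            probs = prior[seq[-1]]
            seq.append(int(rng.choice(len(probs), p=probs)))
        return seq

    n_codes = 4
    # Deterministic toy prior: code k is always followed by (k + 1) mod 4.
    prior = np.roll(np.eye(n_codes), 1, axis=1)
    seq = sample_codes(prior, 5, start=0, rng=np.random.default_rng(0))
    print(seq)  # -> [0, 1, 2, 3, 0, 1]
    ```

    The sampled code sequence would then be decoded back to a waveform by the VQ-VAE decoder; a real Transformer prior conditions on the whole prefix (and on artist, genre, and lyrics), not just the previous token.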

    Bio: Prafulla Dhariwal is a research scientist at OpenAI leading work on generative models under the guidance of Ilya Sutskever. His work focuses on modeling high-dimensional data while preserving fidelity and diversity, with prominent works being Glow, a normalizing flow generating high-resolution images with fast sampling, and the Variational Lossy Autoencoder, a way to understand and prevent latent collapse with autoregressive decoders in VAEs. In the past, he has also worked on reinforcement learning, including PPO, a popular on-policy RL algorithm, and GamePad, an environment to make it easier to apply RL to formal theorem proving. He obtained his undergraduate degree from MIT in 2017 with a double major in Computer Science and Mathematics.

