# Jukebox, neural net that generates music, OpenAI Inc., San Francisco, California, USA

## Airicist

Developer - OpenAI Inc.

openai.com/blog/jukebox

github.com/openai/jukebox

"Jukebox: a generative model for music"

by Prafulla Dhariwal, Heewoo Jun, Christine Payne, Jong Wook Kim, Alec Radford, Ilya Sutskever

----------


## Airicist

Jukebox: a generative model for music (paper explained)

May 2, 2020




> This generative model for music can make entire songs with remarkable quality and consistency. It can be conditioned on genre, artist, and even lyrics.
> 
> Abstract:
> We introduce Jukebox, a model that generates music with singing in the raw audio domain. We tackle the long context of raw audio using a multiscale VQ-VAE to compress it to discrete codes, and modeling those using autoregressive Transformers. We show that the combined model at scale can generate high-fidelity and diverse songs with coherence up to multiple minutes. We can condition on artist and genre to steer the musical and vocal style, and on unaligned lyrics to make the singing more controllable. We are releasing thousands of non cherry-picked samples, along with model weights and code.
> 
> Authors: Prafulla Dhariwal, Heewoo Jun, Christine Payne, Jong Wook Kim, Alec Radford, Ilya Sutskever
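The core trick the abstract describes is vector quantization: the VQ-VAE encoder maps raw audio to continuous latent frames, and each frame is snapped to its nearest entry in a learned codebook, yielding discrete tokens a Transformer can model like text. A minimal sketch of that quantization step (in NumPy; the names, shapes, and toy codebook here are illustrative, not Jukebox's actual API):

```python
import numpy as np

def quantize(latents, codebook):
    # latents: (T, D) continuous encoder outputs; codebook: (K, D) learned codes.
    # Each latent frame is replaced by the index of its nearest codebook entry
    # (squared Euclidean distance), turning continuous audio features into
    # discrete tokens that an autoregressive Transformer can model.
    dists = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)  # (T, K)
    codes = dists.argmin(axis=1)      # discrete code indices, shape (T,)
    reconstructed = codebook[codes]   # quantized latents fed to the decoder
    return codes, reconstructed

# Toy example: 4 frames of 2-D latents against a 3-entry codebook.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]])
latents = np.array([[0.1, -0.1], [0.9, 1.2], [-1.1, 0.8], [0.0, 0.1]])
codes, recon = quantize(latents, codebook)  # codes: [0, 1, 2, 0]
```

In the real model the codebook is trained jointly with the encoder and decoder, and a straight-through estimator carries gradients past the non-differentiable argmin.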

----------


## Airicist

Article "This AI is creating some surprisingly good bops based on music by Katy Perry and Kanye West — listen to some of the best"

by Paige Leskin
May 4, 2020

----------


## Airicist

OpenAI’s Jukebox AI writes amazing new songs

Jun 9, 2020

----------


## Airicist

Prafulla Dhariwal (OpenAI) - Jukebox: a generative model for music

Aug 12, 2020




> Prafulla Dhariwal (OpenAI)
> Jukebox: A Generative Model for Music
> Presentation recorded June 19, 2020
> 
> Abstract: Music is an extremely challenging domain for generative modeling: it’s highly diverse, humans are perceptive to small errors, and it has extremely long range dependencies to learn if generated as raw audio. We show it’s possible to generate music with singing directly in the raw audio domain. We tackle the long sequence lengths of raw audio using a multi-scale VQ-VAE to compress it to discrete codes, and model those using autoregressive Transformers. We show that the combined model at scale can generate high-fidelity and diverse songs with coherence up to multiple minutes. We can condition on artist and genre to steer the musical and vocal style, and on unaligned lyrics to make the singing more controllable. 
> 
> Bio: Prafulla Dhariwal is a research scientist at OpenAI leading work on generative models under the guidance of Ilya Sutskever. His work focuses on modeling high dimensional data while preserving fidelity and diversity, with prominent works being Glow, a normalizing flow generating high resolution images with fast sampling; and Variational Lossy Auto-encoder, a way to understand and prevent latent collapse with autoregressive decoders in VAE’s. In the past, he’s also worked on reinforcement learning, including PPO, a popular on-policy RL algorithm; and GamePad, an environment to make it easier to apply RL to formal theorem proving. He obtained his undergraduate degree from MIT in 2017 with a double major in Computer Science and Mathematics.
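The abstract above stresses the long sequence lengths of raw audio, which is why the multi-scale compression matters. A back-of-envelope calculation, using the 8x/32x/128x hop lengths and 44.1 kHz sample rate reported in the Jukebox paper (the helper function itself is my own illustration), shows how many discrete codes each level emits per minute of audio:

```python
# Token-count arithmetic for multi-scale audio compression.
# 44.1 kHz sample rate and 8x/32x/128x hop lengths are the figures
# from the Jukebox paper; the function name is illustrative.
SAMPLE_RATE = 44100  # CD-quality audio, samples per second

def tokens_per_level(seconds, hops=(8, 32, 128)):
    """Discrete codes each VQ-VAE level emits for a clip of given length."""
    samples = seconds * SAMPLE_RATE
    return {f"{h}x": samples // h for h in hops}

counts = tokens_per_level(60)  # one minute of audio
# {'8x': 330750, '32x': 82687, '128x': 20671}
```

A minute of raw audio is over 2.6 million samples; even the coarsest 128x level leaves roughly 20k tokens, which is why coherence "up to multiple minutes" is the headline result.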

----------

