Hands-On Music Generation with Magenta

By: DuBreuil

Overview of this book

The importance of machine learning (ML) in art is growing at a rapid pace due to recent advancements in the field, and Magenta is at the forefront of this innovation. With this book, you'll follow a hands-on approach to using ML models for music generation, learning how to integrate them into an existing music production workflow. Complete with practical examples and explanations of the theoretical background required to understand the underlying technologies, this book is the perfect starting point for exploring music generation.

The book will help you learn how to use Magenta's models to generate percussion sequences, monophonic and polyphonic melodies in MIDI, and instrument sounds in raw audio. Through practical examples and in-depth explanations, you'll understand ML models such as RNNs, VAEs, and GANs. Using this knowledge, you'll create and train your own models for advanced music generation use cases and prepare new datasets. Finally, you'll get to grips with integrating Magenta with other technologies, such as digital audio workstations (DAWs), and with using Magenta.js to distribute music generation apps in the browser. By the end of this book, you'll be well-versed in Magenta and have developed the skills you need to use ML models for music generation in your own style.
Table of Contents (16 chapters)
  • Section 1: Introduction to Artwork Generation
  • Section 2: Music Generation with Machine Learning
  • Section 3: Training, Learning, and Generating a Specific Style
  • Section 4: Making Your Models Interact with Other Applications

Google's Magenta and TensorFlow in music generation

Since its launch, TensorFlow has been important to the data science community as "An Open Source Machine Learning Framework for Everyone". Magenta, which is built on TensorFlow, can be seen the same way: even though it uses state-of-the-art machine learning techniques, it can still be used by anyone. Musicians and computer scientists alike can install it and generate new music in no time.

In this section, we'll look at the content of Magenta by introducing what it can and cannot do, referring along the way to the chapters that explain each topic in more depth.

Creating a music generation system

Magenta is a framework for art generation, but also for attention, storytelling, and the evaluation of generative music. As the book advances, we'll come to see and understand how those elements are crucial for pleasing music generation.

Evaluating and interpreting generative models is inherently hard, especially for audio. A common criterion in machine learning is the average log-likelihood, which measures how closely the generated samples match the training data. This gives you the proximity of two elements, but not the musicality of the generated one.
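To make the criterion concrete, here is a minimal sketch, assuming a univariate Gaussian as a stand-in for the model's learned distribution (names and numbers are illustrative, not Magenta-specific):

```python
import math

def gaussian_log_likelihood(x, mean, std):
    """Log-density of x under a univariate Gaussian."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)

def average_log_likelihood(samples, mean, std):
    """Average log-likelihood of a batch of generated samples."""
    return sum(gaussian_log_likelihood(x, mean, std) for x in samples) / len(samples)

# Samples near the training mean score higher than distant ones, but a high
# score only measures proximity to the training data, not musicality.
close = average_log_likelihood([0.1, -0.2, 0.0], mean=0.0, std=1.0)
far = average_log_likelihood([3.0, -3.5, 4.0], mean=0.0, std=1.0)
```

A generated melody can sit close to the training data under this score and still sound dull; that gap is exactly why the score alone is not enough.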

Even if the progress being made with GANs is promising for such evaluations, we are often left with only our ears to judge. We can also imagine a Turing test for a piece of music: a composition is played to an audience that has to decide whether it was generated by a computer.

We'll be using Magenta for two different purposes, assisting and autonomous music creation:

  • Assisting music systems help with the process of composing music. An example is the Magenta MIDI interface, magenta_midi.py, where the musician enters a MIDI sequence and Magenta answers with a generated sequence inspired by the provided one. These types of systems can be used alongside traditional tools to compose music and find new inspiration. We'll talk about this in Chapter 9, Making Magenta Interact with Music Applications, where Magenta Studio is integrated into a traditional music production tool.
  • Autonomous music systems continuously produce music without input from an operator. By the end of this book, you'll have all the tools you need to build an autonomous music generation system out of Magenta's various building blocks.

Looking at Magenta's content

As we saw in the previous section, there are many ways of representing music: symbolic data, spectrogram data, and raw audio data. Magenta works mainly with symbolic data, meaning we'll mostly work on the underlying score of the music instead of working directly with audio. Let's look at Magenta's content, model by model.

Differentiating models, configurations, and pre-trained models

In Magenta and in this book, the term model refers to a specific deep neural network designed for one task. For example, the Drums RNN model is an LSTM network with an attention configuration, while the MusicVAE model is a variational autoencoder network. The Melody RNN model is also an LSTM network, but it is geared toward generating melodies instead of percussion patterns.

Each model has different configurations that change how the data is encoded for the network, as well as how the network is set up. For example, the Drums RNN model has a one_drum configuration, which encodes the sequence to a single class, and a drum_kit configuration, which maps the sequence to nine drum instruments and also sets the attention length to 32.
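To illustrate what such an encoding does, here is a hypothetical sketch of a nine-category drum mapping in the spirit of drum_kit (the pitch lists and class scheme below are simplified stand-ins; Magenta's actual mapping tables differ in detail):

```python
# Hypothetical nine drum categories, each grouping several MIDI pitches.
DRUM_CLASSES = [
    [35, 36],  # bass drum
    [38, 40],  # snare drum
    [42, 44],  # closed hi-hat
    [46],      # open hi-hat
    [41, 43],  # low tom
    [45, 47],  # mid tom
    [48, 50],  # high tom
    [49, 57],  # crash cymbal
    [51, 59],  # ride cymbal
]

# Invert the table: MIDI pitch -> class index 0..8.
PITCH_TO_CLASS = {p: i for i, cls in enumerate(DRUM_CLASSES) for p in cls}

def encode_step(pitches):
    """Encode the set of drums hit at one time step as a single integer class.

    Each of the nine categories contributes one bit, so any combination of
    simultaneous hits maps to a unique class in [0, 2**9)."""
    value = 0
    for pitch in pitches:
        value |= 1 << PITCH_TO_CLASS[pitch]
    return value
```

For example, a bass drum (pitch 36) hit together with a closed hi-hat (pitch 42) sets bits 0 and 2, giving class 5. This is why a nine-instrument configuration can still represent every simultaneous drum combination with one class per step.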

Finally, each configuration comes with one or more pre-trained models. For example, Magenta provides a pre-trained Drums RNN drum_kit model, as well as multiple pre-trained MusicVAE cat-drums_2bar_small models.

We'll be using this terminology throughout this book. For the first few chapters, we'll be using Magenta's pre-trained models, since they are already quite powerful. Afterward, we'll create our own configurations and train our own models.

Generating and stylizing images

Image generation and stylization can be achieved in Magenta with the Sketch RNN and Style Transfer models, respectively. Sketch-RNN is a Sequence-to-Sequence (Seq2Seq) variational autoencoder.

Seq2Seq models are used to convert sequences from one domain into another (for example, to translate a sentence in English into a sentence in French), where the input and output do not necessarily have the same length, something a traditional model structure cannot handle. The network encodes the input sequence into a vector, called a latent vector, from which a decoder tries to reproduce the input sequence as closely as possible.
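The encoder/decoder data flow can be sketched without any learning at all. This toy stand-in (bucket averaging replaces the learned RNN encoder; everything here is illustrative) only shows variable-length sequences mapping to a fixed-size latent vector and back:

```python
def encode(sequence, latent_size=4):
    """Map a variable-length sequence to a fixed-size latent vector
    by averaging the values falling into each bucket."""
    sums = [0.0] * latent_size
    counts = [0] * latent_size
    for i, x in enumerate(sequence):
        bucket = i * latent_size // len(sequence)
        sums[bucket] += x
        counts[bucket] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]

def decode(latent, length):
    """Expand a latent vector back into a sequence of the requested length."""
    return [latent[i * len(latent) // length] for i in range(length)]
```

Whatever the input length, encode always produces a latent vector of the same size, and decode can target any output length. That is the property that lets a real Seq2Seq model produce an output sequence of a different length than its input.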

Image processing is not part of this book, but we'll see the usage of latent space in Chapter 4, Latent Space Interpolation with MusicVAE, when we use the MusicVAE model. If you are interested in the Sketch RNN model, see the Further reading section for more information.

Generating audio

Audio generation in Magenta is done with the NSynth and GANSynth models; NSynth is a WaveNet-based autoencoder. What's interesting about WaveNet is that it is a convolutional architecture, which is prevalent in image applications but seldom used in music applications, where recurrent networks are usually favored. Convolutional neural networks (CNNs) are mainly defined by a convolution stage, in which a filter is slid over the image to compute a feature map. Different filter matrices can detect different features, such as edges or curves, which are useful for image classification.
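The convolution stage itself is simple to sketch in plain Python (valid padding, stride 1; the vertical-edge kernel is a classic textbook example, not anything Magenta-specific):

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (valid padding, stride 1)
    and compute the resulting feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [sum(image[i + di][j + dj] * kernel[di][dj]
             for di in range(kh) for dj in range(kw))
         for j in range(out_w)]
        for i in range(out_h)
    ]

# A vertical-edge kernel: strong response where intensity changes
# between columns, zero response in flat regions.
edge_kernel = [[1, 0, -1],
               [1, 0, -1],
               [1, 0, -1]]
image = [[0, 0, 1, 1, 1],
         [0, 0, 1, 1, 1],
         [0, 0, 1, 1, 1]]
feature_map = convolve2d(image, edge_kernel)  # [[-3, -3, 0]]
```

The two nonzero entries mark the edge between the dark and bright columns; the final zero is the flat region. A CNN learns many such kernels instead of hand-picking them.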

We'll see the usage of these models in Chapter 5, Audio Generation with NSynth and GANSynth.

Generating, interpolating, and transforming score

Score generation is the main part of Magenta and can be split into different categories representing the different parts of a musical score:

  • Rhythm generation: This can be done with the Drums RNN model, an RNN that applies language modeling using an LSTM. Drum tracks are polyphonic by definition because multiple drums can be hit simultaneously. This model will be presented in Chapter 2, Generating Drum Sequences with Drums RNN.
  • Melody generation: Also known as monophonic generation, this can be done with the Melody RNN and Improv RNN models, which also implement attention, allowing the models to learn longer-term dependencies. These models will be presented in Chapter 3, Generating Polyphonic Melodies.
  • Polyphonic generation: This can be done with the Polyphony RNN and Performance RNN models, where the latter also implements expressive timing (sometimes called groove, where the notes don't start and stop exactly on the grid, giving them a human feel) and dynamics (or velocity). These models will be presented in Chapter 3, Generating Polyphonic Melodies.
  • Interpolation: This can be done with the MusicVAE model, a variational autoencoder that learns the latent space of musical sequences and can interpolate between existing sequences. This model will be presented in Chapter 4, Latent Space Interpolation with MusicVAE.
  • Transformation: This can be done with the GrooVAE model, a variant of MusicVAE that adds groove to an existing drum performance. This model will be presented in Chapter 4, Latent Space Interpolation with MusicVAE.
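The interpolation idea above can be sketched as plain linear interpolation between two latent vectors. This shows only the arithmetic, with illustrative function names (MusicVAE's actual sampling and decoding are more involved):

```python
def lerp(z_a, z_b, t):
    """Linear interpolation between two latent vectors:
    t=0 gives z_a, t=1 gives z_b."""
    return [(1 - t) * a + t * b for a, b in zip(z_a, z_b)]

def interpolate(z_a, z_b, steps):
    """Return `steps` evenly spaced latent vectors from z_a to z_b inclusive."""
    return [lerp(z_a, z_b, i / (steps - 1)) for i in range(steps)]

# Decoding each intermediate vector with a trained decoder would yield
# sequences that gradually morph from one original into the other.
path = interpolate([0.0, 0.0], [1.0, 2.0], 3)  # [[0.0, 0.0], [0.5, 1.0], [1.0, 2.0]]
```

Because the latent space is smooth, nearby latent vectors decode to similar-sounding sequences, which is what makes the intermediate points musically meaningful rather than noise.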