248 — MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment

Dong, Hsiao, et al. (1709.06298)

Read on 25 April 2018
#music  #deep-learning  #GAN  #machine-learning  #multi-track  #polyphony  #time-series  #synthesis  #Lakh-MIDI 

The music-synthesis systems in papers I've previously covered (such as PiaF, #166, or JamBot, #96) have mostly relied on machine-learning techniques that approximate existing musical samples or structures.

MuseGAN, and its follow-up BinaryMuseGAN, leverage recent developments in GAN technology to address the challenges of tempo, beat, and overall musicality.

The authors propose three models that each generate music-like somethings: a Jamming model, a Composer model, and a Hybrid model. Each model generates five symbolic tracks (bass, drums, guitar, piano, and strings) and was trained on a large rock-music MIDI corpus. The generated samples were graded on harmony, rhythmic structure, multi-track dependency, and temporal structure.
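For context: the models work on multi-track piano-rolls rather than raw audio, so a training example is essentially a boolean tensor of tracks × bars × time steps × pitches. Here's a minimal sketch of that representation; the dimension sizes and the track index I use are illustrative placeholders, not values taken from the paper.

```python
import numpy as np

# Rough shape of one training example as a multi-track piano-roll.
# Dimension sizes below are illustrative, not the paper's exact values.
N_TRACKS      = 5    # bass, drums, guitar, piano, strings
N_BARS        = 4    # a short musical phrase
STEPS_PER_BAR = 96   # time resolution within a bar
N_PITCHES     = 84   # pitch range after cropping unused notes

phrase = np.zeros((N_TRACKS, N_BARS, STEPS_PER_BAR, N_PITCHES), dtype=bool)

# "Turn on" one sustained note on the piano track for the
# first half of bar 0 (track and pitch indices are made up).
PIANO = 3
phrase[PIANO, 0, :48, 36] = True

print(phrase.shape, phrase.sum(), "cells active")
```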

In the Jamming model, several independent generators each work from their own private random input, so each track follows its own internal musical plan. In the Composer model, a single generator works from one shared random input and plans all the tracks jointly. (The Hybrid model combines the two: each track gets both the shared input and a private one.)
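To keep the distinction straight, here is a minimal sketch of how the random inputs are wired in the three schemes. Everything here is a placeholder: `fake_generator` stands in for whatever network maps noise to a track, and `LATENT_DIM` / `N_TRACKS` are values I picked, not the paper's.

```python
import numpy as np

N_TRACKS = 5        # bass, drums, guitar, piano, strings
LATENT_DIM = 128    # size of each random noise vector (assumed)

def fake_generator(z):
    """Placeholder for a per-track generator network: noise -> one track."""
    return np.tanh(z)  # stands in for a real piano-roll decoder

def jamming(rng):
    # Jamming: each track has its own generator and its own private noise.
    return [fake_generator(rng.normal(size=LATENT_DIM)) for _ in range(N_TRACKS)]

def composer(rng):
    # Composer: one shared noise vector drives every track,
    # i.e. a single "composer" plans all parts jointly.
    z_shared = rng.normal(size=LATENT_DIM)
    return [fake_generator(z_shared) for _ in range(N_TRACKS)]

def hybrid(rng):
    # Hybrid: each track sees the shared noise plus its own private noise.
    z_shared = rng.normal(size=LATENT_DIM)
    return [fake_generator(np.concatenate([z_shared, rng.normal(size=LATENT_DIM)]))
            for _ in range(N_TRACKS)]

rng = np.random.default_rng(0)
tracks = hybrid(rng)
print(len(tracks), tracks[0].shape)  # 5 placeholder "tracks"
```

The practical difference is how much the tracks can coordinate: purely private noise (Jamming) gives independent parts, purely shared noise (Composer) gives tightly coupled ones, and the Hybrid wiring lets each track balance the two.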

The results are…mostly what you’d expect. But there are a few tracks that really sound human-generated and very musical. I’d recommend taking a look and a listen.

Data and samples are available at the project’s website.