95 — JamBot: Music Theory Aware Chord Based Generation of Polyphonic Music with LSTMs

Brunner et al (1711.07682)

Read on 23 November 2017
#music  #audio  #MIDI  #LSTM  #CNN  #neural-network 

Today’s a quick one, because Thanksgiving: Earlier tonight, my aunt, parents, and grandmother played a five-hand piece on our one piano; the interactions between the notes in just about every octave of the instrument almost — but not entirely — followed a very strict musical grammar.

How human brains handle improv and musical chord progressions has been a long-standing question in the fields of psychology, music, and now, artificial intelligence.

The authors of this paper approach the problem using a LSTM, a neural network that “remembers” past inputs on a temporal sliding window. The net was fed a series of MIDI translations of musical pieces, and was asked to predict the next chord that would be played. For the purposes of this paper, only chords with more than three notes played simulateously were used.

They found that the neural network “learned” the circle of fifths from the songs: It commonly predicted the next chord along the circle when asked to predict a chord (which is often a correct choice when considering musicality of a progression, and often was the actual answer used by the musical input literature).

The network, dubbed JamBot, often produced music that followed reasonable chord progressions, but also produced dissonance where it was not appropriate, and failed to produce repeating melodies in its outputs.

This suggests that while chord progressions can be easily predicted, melodies require something a little more human, which I imagine surprises no one.