185 — Neural Turing Machines
Graves, Wayne & Danihelka (1410.5401)
Read on 21 February 2018

In this work, Google DeepMind designs a neural network architecture that is roughly analogous to a Turing Machine: it contains both a computational component, a conventional neural network playing the role of the TM's "program", and an external memory resource, the TM's "tape".
I read about this sort of "hybrid computing" previously (#21), in a paper written by a superset of these authors. This new hybrid, dubbed a Neural Turing Machine, uses an addressable memory space as a long-term storage device. NTMs are fully differentiable, so the whole system can be trained end to end with gradient descent.
The network accesses memory by emitting a key vector that addresses a location or set of locations in the memory bank. An addressed location can be read from using a "read head" or written to using a "write head." Read and write heads address memory in the same way; the controller network emits the parameters that drive each head at every timestep.
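To make the mechanics concrete, here is a minimal NumPy sketch of content-based addressing and the read/write operations, loosely following the paper's equations (cosine-similarity key matching sharpened by a key strength, reads as weighted sums, writes as an erase step followed by an add step). This is not the authors' code; names and shapes are my own.

```python
import numpy as np

def content_addressing(memory, key, beta):
    """Return a normalized weighting over memory rows.

    memory : (N, M) array -- N slots, each an M-dimensional vector
    key    : (M,) array   -- key vector emitted by the controller
    beta   : float > 0    -- key strength; larger values sharpen the focus
    """
    # Cosine similarity between the key and every memory row.
    sim = memory @ key / (
        np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8
    )
    # Softmax over similarities, sharpened by beta.
    e = np.exp(beta * sim)
    return e / e.sum()

def read(memory, w):
    """Read head: a soft, weighted sum of memory rows."""
    return w @ memory

def write(memory, w, erase, add):
    """Write head: erase then add, both modulated by the weighting w."""
    memory = memory * (1 - np.outer(w, erase))  # erase step
    memory = memory + np.outer(w, add)          # add step
    return memory

# Toy usage
M = np.random.randn(8, 4)                     # 8 memory slots of width 4
w = content_addressing(M, M[3], beta=5.0)     # focus near slot 3
r = read(M, w)                                # vector close to M[3]
M = write(M, w, erase=np.ones(4), add=np.array([1.0, 0.0, 0.0, 0.0]))
```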
Because the network possesses this analog of working memory, the authors show that it can be taught simple algorithms (copying, associative recall, sorting) that generalize "well outside its training regime"; a rough sketch of what that means follows.
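As a hypothetical illustration (the sequence lengths and encoding here are illustrative, not the paper's exact settings), the copy task can be generated so that training only ever sees short sequences while evaluation uses much longer ones:

```python
import numpy as np

def copy_task_batch(seq_len, width=8):
    """Input: a random binary sequence followed by a delimiter flag;
    target: reproduce the sequence after the delimiter."""
    seq = np.random.randint(0, 2, size=(seq_len, width)).astype(float)
    delimiter = np.zeros((1, width + 1))
    delimiter[0, -1] = 1.0
    inputs = np.vstack([
        np.hstack([seq, np.zeros((seq_len, 1))]),  # presentation phase
        delimiter,                                 # "now reproduce it"
        np.zeros((seq_len, width + 1)),            # response phase (blank input)
    ])
    targets = seq
    return inputs, targets

train_x, train_y = copy_task_batch(seq_len=np.random.randint(1, 21))  # short, seen in training
test_x, test_y = copy_task_batch(seq_len=100)                         # far longer than training
```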
This work is particularly interesting to me because it enables much longer-term memory than existing systems like LSTMs afford on their own. NTMs also treat memory access as a probabilistic, "blurry" function, which is an interesting departure for systems that have always relied upon static, reliable, hard-drive-style memory.
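A toy illustration of that soft access, under my own framing rather than the paper's: a conventional hard lookup is the special case of an NTM read whose weighting is one-hot, and because the general read is just a weighted sum, it remains differentiable with respect to the weights.

```python
import numpy as np

M = np.arange(12, dtype=float).reshape(4, 3)  # 4 memory slots of width 3

hard = np.array([0.0, 0.0, 1.0, 0.0])         # conventional address: exactly slot 2
soft = np.array([0.05, 0.1, 0.8, 0.05])       # NTM-style distribution over slots

print(hard @ M)  # exactly row 2 of M
print(soft @ M)  # mostly row 2, blended with its neighbours
```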