365Papers

281 — Deep Watershed Detector for Music Object Recognition

Read on 28 May 2018
#OMR #typography #neural-network #machine-learning #watershed #computer-vision #CV #MUSCIMA++ #DeepScores #music #sheet-music #notation

Recognizing music from drawn notation is something that humans are… well… honestly, we’re really bad at it. Especially when we’re given a time constraint, this is a notoriously difficult task. (You probably know it by the name “sight-reading.”) And machines aren’t much better, especially when the notation is hand-written: Hand-written notation is much harder to understand because it doesn’t follow the exact notational typesetting procedures (think, “kerning” or “ligatures”) followed by software solutions.

This looks like a job for deep learning!

This technology uses a watershed transform with an engineered energy surface: This energy surface is metaphorically equivalent to “hoisting” regions of the input onto “dry land”: In the final segmentation results, these regions will not fall into a watershed, and will instead be considered part of a boundary class.

Tested on MUSCIMA++ and DeepScores datasets, this system achieves state of the art performance, with the greatest difficulty detecting note heads along ledger lines (which I imagine some hand-tooled training could fix).

Comments?

@j6m8