281 — Deep Watershed Detector for Music Object Recognition
Recognizing music from drawn notation is something that humans are… well… honestly, we’re really bad at it. Especially when we’re given a time constraint, this is a notoriously difficult task. (You probably know it by the name “sight-reading.”) And machines aren’t much better, especially when the notation is hand-written: Hand-written notation is much harder to understand because it doesn’t follow the exact notational typesetting procedures (think, “kerning” or “ligatures”) followed by software solutions.
This looks like a job for deep learning!
This technology uses a watershed transform with an engineered energy surface: This energy surface is metaphorically equivalent to “hoisting” regions of the input onto “dry land”: In the final segmentation results, these regions will not fall into a watershed, and will instead be considered part of a boundary class.
Tested on MUSCIMA++ and DeepScores datasets, this system achieves state of the art performance, with the greatest difficulty detecting note heads along ledger lines (which I imagine some hand-tooled training could fix).