243 — A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs
George & Lázaro-Gredilla et al (10.1126/science.aag2612)
Read on 20 April 20184/20 AMAZE IT
CAPTCHAs are probably familiar to you, even if the word itself is not: They’re the “prove-you-aren’t-a-robot” mechanism employed by many sites to prevent automation. But much like every other vision-based imagery task, it’s turning out that computers are better at it than humans. Which is usually a good thing, but here, definitionally breaks the functionality of the tool.
Using a Recursive Cortical Network (RCN) — the name given to a novel type of neural network architecture that uses a structured representation of a net that has been “designed” for vision rather than conventional nets (for which image components are fully learned) — the team at Vicarious breaks the usefulness of CAPTCHA.
RCNs use their built-in hierarchical architecture to characterize the contours and edges of an object — essentially encoding object detection as a low-level function (much like we believe visual cortex does). Attributing certain contours to objects — “explaining” them away — enables the RCN to focus explicitly on non-background, non-clutter, and non-noise contours, which it can then easily classify as a digit or character.
Granted, if you can afford the compute-resources to run RCNs against CAPTCHAs, you probably don’t have a good reason to — other than to prove that you can.
Oh, and like any good researchers oughta, the team releaed their code here.