74 — Learning deep features for source color laser printer identification based on cascaded learning

Kim et al (1711.00207)

Read on 03 November 2017
#print  #forensics  #neural-network  #CNN  #GAN 

Laser printers are used for a variety of commercial tasks, and forgeries printed using laser printers are quite common due to their high speed and quality of output. The ink drums of different printers are often very slightly different: This is a result of the manufacturer, the software used to mix colors, and other factors.

The authors first present a “reversal” algorithm that takes a RGB image and separates it into CMYK ink channels. This is an important step because these separate channels used by a printer are not the same as the digital light channels used by the image storage format to store the scanned image.

Once this discomposition is complete, the authors use a novel CNN architecture to classify the zoomed and scanned picture based on its most likely source printer, from a small population of printers used in the experiments.

From these trials, the authors were able to achieve over 90% accuracy in predicting which printer printed a document or photograph, which is dramatically higher than their listed competitive algorithms.

The authors note, however, that this classifier fails to identify when the target printer cannot be found in the population of known printers; because the paper only uses eight printers as options, I think that the algorithm’s identification accuracy might suffer when introduced to a larger range of printers (since the latent feature space will grow denser).