Improving the Robustness of Synthetic Images Detection by Means of Print and Scan Augmentation
Nischay Purnekar
First author
2024-01-01
Abstract
A common approach to improving the robustness of synthetic image detectors against image post-processing is to augment the training dataset by applying a selected pool of image processing operators. Commonly adopted augmentations include JPEG compression, geometric transformations, color adjustment, noise addition, and filtering. Robustness against operators that are not included in the augmentation pool remains problematic, however, since detectors tend to overfit to the operators used during training and fail to generalize to other kinds of processing. In this paper, we introduce a new form of data augmentation based on the simulation of the Print & Scan (P&S) process. We argue that requiring the synthetic image detector to keep working after an image has been printed and scanned forces it to rely on robust features that can be detected even after other forms of processing. Since creating a large enough dataset of P&S images is infeasible, we trained a CycleGAN network to simulate the P&S process and used it for data augmentation. The results obtained by applying this procedure to a detector trained to distinguish real from synthetic images in different domains show that P&S augmentation improves the robustness of the detectors even on images processed by operators that were not used during training.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


