Detecting adversarial examples - A lesson from multimedia security

Schöttle, Pascal; Schlögl, Alexander; Pasquini, Cecilia; Böhme, Rainer

doi:10.23919/EUSIPCO.2018.8553164

Adversarial classification is the task of performing robust classification in the presence of a strategic attacker. Originating from information hiding and multimedia forensics, adversarial classification recently received a lot of attention in a broader security context. In the domain of machine learning-based image classification, adversarial classification can be interpreted as detecting so-called adversarial examples, which are slightly altered versions of benign images. They are specifically crafted to be misclassified with a very high probability by the classifier under attack. Neural networks, which dominate among modern image classifiers, have been shown to be especially vulnerable to these adversarial examples. However, detecting subtle changes in digital images has always been the goal of multimedia forensics and steganalysis, two major subfields of multimedia security. We highlight the conceptual similarities between these fields and secure machine learning. Furthermore, we adapt a linear filter, similar to early steganalysis methods, to detect adversarial examples that are generated with the projected gradient descent (PGD) method, the state-of-the-art algorithm for this task. We test our method on the MNIST database and show for several parameter combinations of PGD that our method can reliably detect adversarial examples. Additionally, the combination of adversarial re-training and our detection method effectively reduces the attack surface of attacks against neural networks. Thus, we conclude that adversarial examples for image classification possibly do not withstand detection methods from steganalysis, and future work should explore the effectiveness of known techniques from multimedia security in other adversarial settings.
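The abstract mentions a linear filter in the spirit of early steganalysis but does not specify it. As a rough illustration only, the sketch below assumes a simple predictor that estimates each interior pixel from the mean of its four neighbours and uses the mean absolute prediction residual as a detection statistic. The function names, the toy images, and the threshold are hypothetical and are not taken from the paper; in practice a threshold would have to be calibrated on benign images.

```python
from typing import List

Image = List[List[float]]  # grayscale image with pixel values in [0, 1]


def filter_residual(img: Image) -> Image:
    """Residual of predicting each interior pixel from its 4 neighbours.

    Assumption: this neighbour-mean predictor stands in for the
    (unspecified) linear filter mentioned in the abstract.
    """
    h, w = len(img), len(img[0])
    residual = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            pred = (img[y - 1][x] + img[y + 1][x]
                    + img[y][x - 1] + img[y][x + 1]) / 4.0
            row.append(img[y][x] - pred)
        residual.append(row)
    return residual


def residual_energy(img: Image) -> float:
    """Mean absolute residual over all interior pixels."""
    values = [abs(v) for row in filter_residual(img) for v in row]
    return sum(values) / len(values)


def looks_adversarial(img: Image, threshold: float) -> bool:
    """Flag an image whose residual energy exceeds a (hypothetical) threshold."""
    return residual_energy(img) > threshold


# Toy data: a flat image, and the same image with a +/-0.1 checkerboard
# pattern standing in for a small high-frequency perturbation.
clean = [[0.5 for _ in range(5)] for _ in range(5)]
noisy = [[0.5 + 0.1 * (1 if (x + y) % 2 == 0 else -1) for x in range(5)]
         for y in range(5)]

print(residual_energy(clean), residual_energy(noisy))
print(looks_adversarial(clean, 0.05), looks_adversarial(noisy, 0.05))
```

The flat image yields no residual, while the checkerboard perturbation survives the filter almost untouched, which is the intuition behind using high-pass residuals to expose small, noise-like changes. Real digits contain edges that also produce residuals, so this toy statistic is far cruder than a trained detector.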

Detecting adversarial examples - A lesson from multimedia security / Schöttle, Pascal; Schlögl, Alexander; Pasquini, Cecilia; Böhme, Rainer. - (2018), pp. 947-951. (Paper presented at the EUSIPCO 2018 conference, held in Rome, Italy, 3-7 September 2018) [10.23919/EUSIPCO.2018.8553164].

2018-01-01

Date of publication: 2018
Proceedings title: EUSIPCO 2018: 26th European Signal Processing Conference
Place of publication: Piscataway, NJ
Publisher: IEEE
ISBN: 978-9-0827-9701-5
Scopus identifier: 2-s2.0-85059814924
WOS identifier: WOS:000455614900191
Publication type: 04.1 Paper in Proceedings

Files in this record:

File	Size	Format
EUSIPCO2018.pdf (archive managers only; publisher's layout; all rights reserved)	262.67 kB	Adobe PDF