Generative adversarial networks for subdomain enumeration

Degani, L.; Bergadano, F.; Mirheidari, S. A.; Martinelli, F.; Crispo, B.

doi:10.1145/3477314.3506967

Subdomain enumeration is a fundamental step of many security processes (e.g., vulnerability discovery, OSINT, and host enumeration). Until now, it has been carried out with deterministic procedures that have shown some limitations. For instance, the process typically requires generating a candidate, which is subsequently checked for validity. While validation is straightforward, defining an optimal candidate generation strategy is still an open problem. This paper presents a novel subdomain enumeration tool that generates high-quality subdomain candidates. We employ a Generative Adversarial Network (GAN) to sample unseen candidates from the distribution of valid subdomain names. The model learns this distribution from publicly available datasets. Moreover, by sampling from the trained model, we address the limitations of traditional algorithms. Our experiments were carried out against 15 domains and a ground truth of 1164 other targets. The 15 domains were carefully selected from bug bounty platforms to avoid terms-of-use violations. Several factors influenced the selection, including popularity, the expected number of subdomains, and the available services. Our experiments aim to validate our approach by measuring the performance increase in subdomain enumeration processes over the state of the art. We benchmark our proposal in terms of candidates’ validity and sample uniqueness. The results showed that, with our GAN, the performance of a traditional subdomain enumeration workflow increased by up to 61%. In addition, according to our ground truth experiments, the GAN was able to guess, on average, 32% of subdomains.
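The generate-then-validate workflow and the two benchmark criteria named in the abstract (candidate validity and sample uniqueness) can be illustrated with a minimal Python sketch. This is not the authors' tool. The function names `dns_resolves` and `evaluate_candidates`, and the exact metric definitions (share of distinct candidates that resolve, share of samples that are distinct), are assumptions made for illustration only. The candidate list stands in for samples drawn from any generator, such as a wordlist or a trained model.

```python
import socket
from typing import Callable, Iterable, List, Tuple


def dns_resolves(hostname: str) -> bool:
    """Hypothetical validity check: True if the name resolves to an address."""
    try:
        socket.gethostbyname(hostname)
        return True
    except (socket.gaierror, UnicodeError):
        return False


def evaluate_candidates(
    labels: Iterable[str],
    domain: str,
    resolves: Callable[[str], bool] = dns_resolves,
) -> Tuple[List[str], float, float]:
    """Validate generated labels against `domain`.

    Returns (valid hostnames, validity ratio, uniqueness ratio).
    Both ratios are assumed definitions, not taken from the paper.
    """
    samples = [label.strip().lower() for label in labels]
    # Deduplicate while preserving order so each candidate is checked once.
    unique = list(dict.fromkeys(s for s in samples if s))
    valid = [f"{s}.{domain}" for s in unique if resolves(f"{s}.{domain}")]
    uniqueness = len(unique) / len(samples) if samples else 0.0
    validity = len(valid) / len(unique) if unique else 0.0
    return valid, validity, uniqueness


if __name__ == "__main__":
    # Offline demonstration with a stubbed resolver; no DNS queries are sent.
    known_hosts = {"www.example.com", "mail.example.com"}
    found, validity, uniqueness = evaluate_candidates(
        ["www", "mail", "www", "dev"],
        "example.com",
        resolves=lambda host: host in known_hosts,
    )
    print(found, validity, uniqueness)
```

Passing the resolver as a parameter keeps the sketch testable without network access. Any real use should target only domains whose terms of use permit enumeration, as the abstract notes for its own experiments.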

Generative adversarial networks for subdomain enumeration / Degani, L.; Bergadano, F.; Mirheidari, S. A.; Martinelli, F.; Crispo, B. - (2022), pp. 1636-1645. (37th ACM/SIGAPP Symposium on Applied Computing, SAC 2022, Virtual, 25-29 April 2022) [10.1145/3477314.3506967].


2022-01-01



Date of publication: 2022
Proceedings title: Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing
Place of publication: 1601 Broadway, 10th Floor, New York, NY, United States
Publisher: Association for Computing Machinery
ISBN: 9781450387132
Scopus identifier: 2-s2.0-85130417044
WOS identifier: WOS:000946564100223
Publication type: 04.1 Paper in Proceedings

Files in this product:

File	Access	Description	Type	License	Size	Format
3477314.3506967.pdf	Open access	Paper	Publisher’s layout	All rights reserved	1.2 MB	Adobe PDF