Anomaly detection is the task of automatically identifying the most unusual elements in a data set. Its applications arise in domains such as computer security, system monitoring, fault detection, and wireless sensor networks. The strategic importance of detecting anomalies in these domains makes anomaly detection a critical data analysis task. Moreover, the contextual nature of anomalies, among other issues, makes it a particularly challenging problem. Anomaly detection has received significant research attention in the last two decades, and much effort has been invested in developing novel algorithms. However, several open challenges remain in the field.

This thesis presents our contributions toward solving these challenges. These contributions include: a methodological survey of the recent literature, a novel benchmarking framework for anomaly detection algorithms, an approach for scaling anomaly detection techniques to massive data sets, and a novel anomaly detection algorithm inspired by the law of universal gravitation. Our methodological survey highlights open challenges in the field and motivates our other contributions. Our benchmarking framework, named BAD, tackles the problem of reliably assessing the accuracy of unsupervised anomaly detection algorithms. BAD leverages parallel and distributed computing to enable massive comparison studies and hyperparameter tuning tasks.

The challenge of scaling unsupervised anomaly detection techniques to massive data sets is well known in the literature. In this context, our contributions are twofold: we investigate the trade-offs between a single-threaded implementation and a distributed approach in terms of price-performance metrics, and we propose an approach for scaling anomaly detection algorithms to arbitrary data volumes. Our results show that, when high scalability is required, our approach can handle arbitrarily large data sets without significantly compromising detection accuracy. We conclude our contributions by proposing a novel algorithm for anomaly detection, named Gravity. Gravity identifies anomalies by considering the attraction forces among massive data elements. Our evaluation shows that Gravity is competitive with other popular anomaly detection techniques on several benchmark data sets. Additionally, the properties of Gravity make it preferable in cases where hyperparameter tuning is challenging or infeasible.
Modern Anomaly Detection: Benchmarking, Scalability and a Novel Approach / Pasupathipillai, Sivam. - (2020 Nov 27), pp. 1-117.
|Title:||Modern Anomaly Detection: Benchmarking, Scalability and a Novel Approach|
|Publication date:||2020-11-27|
|Department:||Dipartimento di Ingegneria e Scienza dell'Informazione|
|Doctoral programme:||Information and Communication Technology|
|External advisor:||Della Valle, Emanuele|
|Supervisors and coordinators:||Velegrakis, Ioannis|
|Jointly supervised thesis (cotutelle):||no|
|Scientific disciplinary sector:||ING-INF/05 - Sistemi di Elaborazione delle Informazioni (Information Processing Systems)|
|Digital Object Identifier (DOI):||10.15168/11572_281952|
|Appears in collections:||08.1 Doctoral Thesis|
Files in this item:
|thesis_final.pdf||Main document||Doctoral Thesis||Embargo: 01/11/2021|