In the pattern recognition community, one of the most critical problems in the design of supervised classification and regression systems is the quality and quantity of the training samples used (ground truth). This problem is particularly important in applications where collecting training samples is an expensive, time-consuming task subject to different sources of error. Active learning is an approach proposed in the literature to address the problem of ground-truth collection: training samples are selected iteratively in order to minimize the number of samples involved and the intervention of human users. In this thesis, new active learning methodologies for classification and regression problems are proposed and applied in three main application fields: remote sensing, biomedicine, and chemometrics. In particular, the proposed methodological contributions include: i) three strategies for the support vector machine (SVM) classification of electrocardiographic signals; ii) a strategy for SVM classification in the context of remote sensing images; iii) a combination of spectral and spatial information in the context of active learning for remote sensing image classification; iv) the exploitation of active learning to solve the problem of covariate shift, which may occur when a classifier trained on a portion of the image is applied to the rest of the image; v) and vi) several strategies for regression problems, proposed to estimate biophysical parameters from remote sensing data and chemical concentrations from spectroscopic data, respectively; and vii) a framework for assisting a human user in the design of a ground truth for classifying a given optical remote sensing image. Experiments conducted on simulated and real data sets are reported and discussed. They all suggest that, despite their complexity, ground-truth collection problems can be tackled satisfactorily by the proposed approaches.

Active learning methods for classification and regression problems / Pasolli, Edoardo. - (2011), pp. 1-133.

Active learning methods for classification and regression problems

Pasolli, Edoardo
2011-01-01

2011
XXIV
2010-2011
Ingegneria e Scienza dell'Informaz (cess.4/11/12)
Information and Communication Technology
Melgani, Farid
no
English
Sector ING-INF/03 - Telecommunications
Files in this item:
PhD-Thesis-Pasolli.pdf
Open access
Type: Doctoral Thesis
License: All rights reserved
Size: 5.37 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/368080