Deep neural network models for image classification and regression

Malek, Salim

doi:10.15168/11572_368992

Deep learning, a branch of machine learning, has been gaining ground in many research fields as well as in practical applications. This ongoing boom can be traced back mainly to the availability and affordability of processing facilities that were not widely accessible even a decade ago. Although it has demonstrated cutting-edge performance in computer vision, particularly in object recognition and detection, deep learning has yet to find its way into other research areas. Furthermore, the performance of deep learning models depends strongly on how they are designed and tailored to the problem at hand. This raises concerns about not only precision but also processing overhead, and the success and applicability of a deep learning system rely jointly on both. In this dissertation, we present innovative deep learning schemes applied to interesting though less-addressed topics.

The first topic is rough scene description for visually impaired individuals, where the idea is to list the objects that likely exist in an image captured by a visually impaired person. To this end, we extract several features from the query image in order to capture its textural and chromatic cues. To improve the representativeness of the extracted features, we reinforce them with a feature learning stage based on an autoencoder model. The autoencoder is topped with a logistic regression layer that detects the presence of objects, if any.

In the second topic, we propose exploiting the same model, the autoencoder, for cloud removal in remote sensing images. Briefly, the model is trained on a cloud-free image of a given geographical area and then applied to a cloud-contaminated image of the same area acquired at a different time. Two reconstruction strategies are proposed, namely pixel-based and patch-based reconstruction. Across these first two topics, we quantitatively demonstrate that autoencoders can play a pivotal role in both (i) feature learning and (ii) reconstruction and mapping of sequential data.

The Convolutional Neural Network (CNN) is arguably the model most used by the computer vision community, which is reasonable given its remarkable performance in object and scene recognition compared with traditional hand-crafted features. Nevertheless, the CNN is naturally formulated in two dimensions, which raises questions about its applicability to one-dimensional data. The third contribution of this thesis is therefore the design of a one-dimensional CNN architecture, which is applied to spectroscopic data. In other words, the CNN is tailored for feature extraction from one-dimensional chemometric data, and the extracted features are fed into advanced regression methods to estimate the underlying chemical component concentrations. Experimental findings suggest that, like 2D CNNs, one-dimensional CNNs tend to outperform traditional methods.

The last contribution of this dissertation is a new method for estimating the connection weights of CNNs, based on training an SVM for each kernel of the CNN. This method has the advantage of being fast and well suited to applications characterized by small datasets.
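The abstract does not specify the one-dimensional CNN architecture, so the following is only a minimal sketch of the general idea of convolutional feature extraction from a 1D signal: a valid 1D convolution, a ReLU, and non-overlapping max pooling, written in plain Python. The function names, the toy spectrum, the kernel values, and the pooling size are all hypothetical and are not taken from the thesis. In the thesis the resulting features are passed on to regression methods, a step that is not shown here.

```python
from typing import List, Sequence


def conv1d(signal: Sequence[float], kernel: Sequence[float], bias: float = 0.0) -> List[float]:
    """Valid (no padding) 1D cross-correlation, the operation CNN layers call convolution."""
    k = len(kernel)
    if k == 0 or k > len(signal):
        raise ValueError("kernel must be non-empty and no longer than the signal")
    return [
        sum(signal[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(signal) - k + 1)
    ]


def relu(values: Sequence[float]) -> List[float]:
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def max_pool1d(values: Sequence[float], size: int) -> List[float]:
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


def extract_features(spectrum: Sequence[float],
                     kernels: Sequence[Sequence[float]],
                     pool_size: int = 2) -> List[float]:
    """Apply each kernel, then ReLU and pooling, and concatenate the feature maps."""
    features: List[float] = []
    for kernel in kernels:
        features.extend(max_pool1d(relu(conv1d(spectrum, kernel)), pool_size))
    return features


# Hypothetical toy spectrum and hand-picked kernels (assumptions for illustration only).
spectrum = [0.0, 1.0, 3.0, 1.0, 0.0, 2.0, 2.0]
kernels = [[-1.0, 1.0],   # responds to rising edges
           [0.5, 0.5]]    # local average
features = extract_features(spectrum, kernels)
print(features)
```

In a trained network the kernel values would be learned rather than fixed by hand; they are set manually here only to keep the sketch self-contained.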

Deep neural network models for image classification and regression / Malek, Salim. - (2018), pp. 1-89. [10.15168/11572_368992]

2018-01-01


Defended on: 2018
Cycle: XXX
Academic year: 2018-2019
Department: Ingegneria e scienza dell'Informaz (29/10/12-)
Doctoral programme: Informatica e telecomunicazioni (until academic year 2020-21, 36th cycle)
Unitn internal supervisor: Melgani, Farid
Bi-nationally supervised doctoral thesis: no
DOI: https://dx.doi.org/10.15168/11572_368992
Language: English
Reference SSD (valid until 24/06/2024): Sector ING-INF/03 - Telecommunications; Sector MAT/06 - Probability and Mathematical Statistics
Appears in types: 08.1 Doctoral Thesis

Files in this record:

File	Access	Type	License	Size	Format
These_final02.pdf	Open access	Doctoral Thesis	All rights reserved	4.35 MB	Adobe PDF
Disclaimer_Malek.pdf	Archive managers only	Doctoral Thesis	All rights reserved	1.04 MB	Adobe PDF