Scene description for visually impaired people with multi-label convolutional SVM networks

Bazi, Y.; Alhichri, H.; Alajlan, N.; Melgani, F.

doi:10.3390/app9235062

In this paper, we present a portable camera-based method for helping visually impaired (VI) people recognize multiple objects in images. The method relies on a novel multi-label convolutional support vector machine (CSVM) network for coarse description of images. The core idea of CSVM is to use a set of linear SVMs as filter banks for feature map generation. During the training phase, the weights of the SVM filters are obtained using a forward-supervised learning strategy, unlike the backpropagation algorithm used in standard convolutional neural networks (CNNs). To handle multi-label detection, we introduce a multi-branch CSVM architecture in which each branch detects one object in the image. This architecture exploits the correlation between the objects present in the image through a suitable fusion mechanism applied to the intermediate outputs of the convolution layers of each branch. The high-level reasoning of the network is done through binary classification SVMs that predict the presence or absence of objects in the image. We report and discuss experiments on two indoor datasets and one outdoor dataset, acquired with a portable camera mounted on a lightweight shield worn by the user and connected via a USB cable to a laptop processing unit.
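The idea of using a linear SVM as a convolution filter can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy patches, the hyperparameters, the ReLU applied to the SVM decision value, and the choice to train by plain subgradient descent on the hinge loss are all assumptions made for illustration. The sketch trains one SVM on labelled patches in a single forward pass (no backpropagation) and then slides it over an image to produce a feature map.

```python
# Illustrative sketch only: one linear SVM used as a convolution filter.
# All names, data, and hyperparameters below are hypothetical.

def train_linear_svm(samples, labels, lam=0.01, lr=0.1, epochs=200):
    """Fit w, b by full-batch subgradient descent on the L2-regularised hinge loss."""
    dim = len(samples[0])
    n = len(samples)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]  # gradient of the regulariser
        grad_b = 0.0
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1.0:  # this sample violates the margin
                for i in range(dim):
                    grad_w[i] -= y * x[i] / n
                grad_b -= y / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def extract_patches(image, k):
    """Return every k x k patch of a 2-D list, flattened row by row."""
    rows, cols = len(image), len(image[0])
    return [
        [image[r + i][c + j] for i in range(k) for j in range(k)]
        for r in range(rows - k + 1)
        for c in range(cols - k + 1)
    ]


def svm_feature_map(image, w, b, k):
    """Slide the SVM over the image; the ReLU of its decision value is the response."""
    cols_out = len(image[0]) - k + 1
    scores = [
        max(0.0, sum(wi * xi for wi, xi in zip(w, patch)) + b)
        for patch in extract_patches(image, k)
    ]
    return [scores[i:i + cols_out] for i in range(0, len(scores), cols_out)]


# Toy training patches: bright 2x2 patches stand for "object" (+1),
# dark ones for "background" (-1).
patches = [
    [1.0, 1.0, 1.0, 1.0],
    [0.9, 1.0, 1.0, 0.8],
    [0.0, 0.0, 0.0, 0.0],
    [0.1, 0.0, 0.2, 0.0],
]
labels = [1, 1, -1, -1]
w, b = train_linear_svm(patches, labels)

# A 4x4 image with a bright object in the top-left corner.
image = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
fmap = svm_feature_map(image, w, b, k=2)
```

In this toy setting the feature map responds where the bright patch sits and is zero over the empty background. A bank of several such SVMs would give several feature maps, and a multi-branch variant would train one bank per object class.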

Scene description for visually impaired people with multi-label convolutional SVM networks / Bazi, Y., Alhichri, H., Alajlan, N., Melgani, F. - In: APPLIED SCIENCES. - ISSN 2076-3417. - 9:23(2019), pp. 506201-506213. [10.3390/app9235062]

2019-01-01

Date of publication: 2019
Journal title: APPLIED SCIENCES
Issue number and part: 23
DOI: https://dx.doi.org/10.3390/app9235062
Scopus identifier: 2-s2.0-85076882887
WOS identifier: WOS:000509476600084
Appears in types: 03.1 Journal article

Files in this item:

File	Access	Type	License	Size	Format
Applied Sciences-2019-CSVM-Blind.pdf	Open access	Publisher's version (Publisher's layout)	Creative Commons	3.69 MB	Adobe PDF