
Feature selection based on the neighborhood entropy / Battiti, Roberto; Mariello, Andrea. - In: IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS. - ISSN 2162-237X. - STAMPA. - 29:12(2018), pp. 6313-6322. [10.1109/TNNLS.2018.2830700]

Feature selection based on the neighborhood entropy

Battiti, Roberto; Mariello, Andrea
2018-01-01

Abstract

In feature selection, a measure that captures nonlinear relationships between features and the class is mutual information (MI), which quantifies how much the information in the features reduces the uncertainty about the output. In this paper, we propose a new measure related to MI, called neighborhood entropy, and a novel filter method based on its greedy minimization. Our algorithm integrates sequential forward selection with approximate nearest-neighbor techniques and locality-sensitive hashing. Experiments show that the classification accuracy is usually higher than that of other state-of-the-art algorithms, with the best results obtained on problems that are highly unbalanced and nonlinearly separable. The order in which the features are selected is also better, leading to higher accuracy with fewer features. The experimental results indicate that our technique can be employed effectively in offline scenarios in which one can dedicate more CPU time to achieve superior...
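The abstract describes a greedy filter method: sequential forward selection driven by a neighborhood-based criterion estimated with nearest neighbors. The paper's exact neighborhood-entropy definition is not given here, so the sketch below is only an illustration of the general scheme, using a hypothetical stand-in criterion (the average label disagreement among each sample's k nearest neighbors, computed by brute force rather than with locality-sensitive hashing); the function names `neighborhood_disagreement` and `forward_select` are ours, not the authors'.

```python
import numpy as np

def neighborhood_disagreement(X, y, k=3):
    # Stand-in for the neighborhood criterion: the average fraction of
    # each sample's k nearest neighbors that carry a different label.
    # Lower values mean the selected features separate the classes better.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)          # exclude each point from its own neighborhood
    nn = np.argsort(d2, axis=1)[:, :k]    # indices of the k nearest neighbors
    return float((y[nn] != y[:, None]).mean())

def forward_select(X, y, n_features, k=3):
    # Greedy sequential forward selection: at each step, add the feature
    # whose inclusion minimizes the neighborhood criterion.
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < n_features and remaining:
        scores = {f: neighborhood_disagreement(X[:, selected + [f]], y, k)
                  for f in remaining}
        best = min(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
    return selected
```

On synthetic data where only the first feature determines the class, the greedy procedure picks that feature first, since adding it yields near-zero neighbor disagreement while a noise feature yields roughly 50%.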
2018
12
Battiti, Roberto; Mariello, Andrea
Files associated with this product:
No files are associated with this product.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/260376
Warning: the displayed data have not been validated by the university.

Citations
  • PMC: n/a
  • Scopus: 39
  • Web of Science: 33
  • OpenAlex: n/a