In this work, we propose a new robust regression algorithm for spectroscopic data analysis. It combines nonlinear kernel regressors (kernel ridge regression [KRR], kernel principal component regression [KPCR], and Gaussian process regression [GPR]) with an optimization based on the nondominated sorting multi-objective genetic algorithm (NSGA-II) to filter residual outliers in the prediction space and leverage points in the feature space. Unlike most existing robust algorithms, the proposed algorithm simultaneously optimizes several complementary objectives, which lets it adapt automatically and detect outliers more effectively. Eliminating outliers is well known to greatly improve a regression model, and this is the motivation for developing a new robust regression algorithm. The algorithm has been applied to five different datasets, and the results are compared with both classical nonlinear regression methods and commonly used robust regression methods: robust continuum regression (RCR), partial robust M-regression (PRM), robust principal component regression (RPCR), robust PLSR (RSIMPLS), and locally weighted regression (LWR). The results show that the proposed algorithm outperforms the classical nonlinear regression methods and is a promising competitor to the robust methods, outperforming most of them. Although the results come from only five datasets, the algorithm can be considered an interesting contribution to improving data analysis in chemometrics.
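
The two ingredients named in the abstract, kernel regression diagnostics and nondominated (Pareto) comparison of objectives, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the RBF kernel, the hyperparameters `gamma` and `lam`, the synthetic data, and the helper names `rbf_kernel`, `krr_diagnostics`, and `pareto_front` are all assumptions made for illustration, and the genetic operators of a full NSGA-II (crossover, mutation, crowding distance) are omitted.

```python
import numpy as np

def rbf_kernel(A, B, gamma=10.0):
    # Squared Euclidean distances between every row of A and every row of B.
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def krr_diagnostics(X, y, lam=0.1, gamma=10.0):
    """Fit kernel ridge regression; return absolute residuals and leverages."""
    K = rbf_kernel(X, X, gamma)
    n = len(y)
    # Smoother ("hat") matrix H with y_hat = H y. (K + lam*I)^-1 K equals
    # K (K + lam*I)^-1 here because both factors are functions of K.
    H = np.linalg.solve(K + lam * np.eye(n), K)
    residuals = np.abs(y - H @ y)   # large value: candidate residual outlier
    leverages = np.diag(H)          # large value: candidate leverage point
    return residuals, leverages

def pareto_front(F):
    """Indices of nondominated rows of F, with every objective minimized."""
    F = np.asarray(F, dtype=float)
    keep = []
    for i in range(len(F)):
        # Row j dominates row i if it is no worse everywhere and better somewhere.
        dominated = np.any(np.all(F <= F[i], axis=1) & np.any(F < F[i], axis=1))
        if not dominated:
            keep.append(i)
    return keep

# Synthetic 1-D example (assumed data): a smooth curve with one injected outlier.
X = np.linspace(0.0, 1.0, 30)[:, None]
y = np.sin(2.0 * np.pi * X[:, 0])
y[10] += 5.0
res, lev = krr_diagnostics(X, y)
print("sample with largest residual:", int(np.argmax(res)))

# Toy objective vectors (assumed): each row is one candidate solution.
print("nondominated candidates:", pareto_front([[1, 5], [2, 2], [3, 3], [5, 1]]))
```

In a sample-selection setting, each candidate would be a subset of training samples and each row of `F` would hold that subset's objective values; which objectives the paper actually uses is not stated in the abstract.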

Genetic robust kernel sample selection for chemometric data analysis / Douak, F.; Ghoggali, N.; Hedjam, R.; Mekhalfi, M. L.; Benoudjit, N.; Melgani, F.. - In: JOURNAL OF CHEMOMETRICS. - ISSN 0886-9383. - 35:6(2021), pp. e334401-e334422. [10.1002/cem.3344]

Genetic robust kernel sample selection for chemometric data analysis

Ghoggali N.; Mekhalfi M. L.; Melgani F.
2021-01-01

Year: 2021
Issue: 6
Authors: Douak, F.; Ghoggali, N.; Hedjam, R.; Mekhalfi, M. L.; Benoudjit, N.; Melgani, F.
Files in this item:
File: Journal of Chemometrics-2021-Sample Selection.pdf
Access: Archive administrators only
Type: Publisher's version (Publisher's layout)
License: All rights reserved
Size: 21.12 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/329642