We show that the size of a classification problem can be reduced by automatically ranking the relative importance of the available features. Variables are sorted by importance with a decision tree algorithm, and correlated ones are removed after ranking. The selected features can then be used as inputs to the classification problem at hand. We tested the method on highly boosted di-jet resonances decaying to two b-quarks, to be selected against an overwhelming QCD background with a deep neural network. We make explicit the relation between the importance rankings obtained with different algorithms. We also show how the signal-to-background ratio changes as the number of features fed to the neural network varies.
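The rank-then-prune step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the importance scores have already been computed (for instance from the `feature_importances_` attribute of a scikit-learn tree model), it uses the Pearson coefficient as the correlation measure, and the `max_corr` cut of 0.9, the function names, and the feature names are all hypothetical choices made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column is treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def select_features(columns, importances, max_corr=0.9):
    """Rank features by importance, then drop any feature that is strongly
    correlated with a higher-ranked feature that was already kept.

    columns:     dict mapping feature name -> list of values (one per event)
    importances: dict mapping feature name -> precomputed importance score
    max_corr:    hypothetical cut on |Pearson r|, not taken from the paper
    """
    ranked = sorted(importances, key=importances.get, reverse=True)
    kept = []
    for name in ranked:
        if all(abs(pearson(columns[name], columns[k])) <= max_corr for k in kept):
            kept.append(name)
    return kept


# Toy example with made-up numbers and hypothetical feature names.
columns = {
    "mass": [1.0, 2.0, 3.0, 4.0, 5.0],
    "mass_scaled": [2.0, 4.0, 6.0, 8.0, 10.0],  # perfectly correlated with "mass"
    "n_tracks": [3.0, 1.0, 4.0, 1.0, 5.0],
}
importances = {"mass": 0.5, "mass_scaled": 0.3, "n_tracks": 0.2}
selected = select_features(columns, importances)
```

In this toy case `mass_scaled` is discarded because it duplicates the information in the higher-ranked `mass`, while `n_tracks` survives. Pruning after ranking means that, within each correlated group, the feature the ranking considers most informative is the one retained.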
Automated selection of particle-jet features for data analysis in high energy physics experiments / Di Luca, A.; Follega, F. M.; Cristoforetti, M.; Iuppa, R.. - In: POS PROCEEDINGS OF SCIENCE. - ISSN 1824-8039. - ELECTRONIC. - 390:(2021). (Paper presented at the 40th International Conference on High Energy Physics, ICHEP 2020, held in cze in 2020).
Automated selection of particle-jet features for data analysis in high energy physics experiments
Di Luca A.; Follega F. M.; Cristoforetti M.; Iuppa R.
2021-01-01