Low-cost sensors (LCSs) show great potential for enabling pervasive, continuous monitoring of crucial environmental parameters, supporting environmental preservation, and informing citizens' well-being through ubiquitous air quality data. Their main drawback is that their data are usually biased, even when the manufacturer calibrates them at production time. Recent studies are considering more accurate in-field calibration methods based on machine learning (ML) and neural networks (NNs). These methods typically require co-locating LCSs with reference measurement stations certified by environmental agencies. Due to seasonality effects, however, the correlation between LCSs and their reference may degrade rapidly once the LCSs are moved from the calibration site, making even very accurate calibrations useless. In this work, we target this problem by optimizing the training settings of the most popular ML and NN calibration models for LCSs when a sequential split scheme is adopted to separate training and test sets. We then assess the degradation of the calibration over time, based on the R² score, when the dataset is split between training and test sets in a ratio different from the classical 80%-20%. This method is applied to real data gathered from an O3 sensor co-located with a certified reference station for a period of six months. Finally, we show that, for long short-term memory (LSTM) NNs, using 20% of the dataset for training is a trade-off that minimizes the calibration effort while still yielding a robust and long-lasting calibration.
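The evaluation scheme described in the abstract (a sequential train/test split followed by an R² check over successive stretches of the test period) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, the linear fit is a simple stand-in for the ML and NN models studied in the paper, and the window length is an arbitrary assumption.

```python
from typing import List, Sequence, Tuple


def sequential_split(n_samples: int, train_fraction: float) -> Tuple[range, range]:
    """Chronological (non-shuffled) split: the first part trains, the rest tests."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    n_train = int(round(n_samples * train_fraction))
    return range(0, n_train), range(n_train, n_samples)


def fit_linear(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares fit y ~ slope * x + intercept.

    Assumption: a stand-in calibration model, not the LSTM from the paper.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0.0:
        raise ValueError("R2 is undefined when the reference values are constant")
    return 1.0 - ss_res / ss_tot


def windowed_r2(
    y_true: Sequence[float], y_pred: Sequence[float], window: int
) -> List[float]:
    """R2 on consecutive, non-overlapping windows of the test period.

    A falling sequence of scores indicates that the calibration degrades
    as time passes since training.
    """
    return [
        r2_score(y_true[i:i + window], y_pred[i:i + window])
        for i in range(0, len(y_true) - window + 1, window)
    ]
```

With time-ordered sensor readings `raw` and reference values `ref`, one would call `sequential_split(len(raw), 0.2)`, fit the model on the training indices only, predict over the test indices, and pass the result to `windowed_r2` to see how the score evolves through the remaining months.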
Minimized Training of Machine Learning-Based Calibration Methods for Low-Cost O3 Sensors / Tondini, Stefano; Scilla, Riccardo; Casari, Paolo. - In: IEEE SENSORS JOURNAL. - ISSN 1530-437X. - 24:3(2024), pp. 3973-3987. [10.1109/jsen.2023.3339202]
Minimized Training of Machine Learning-Based Calibration Methods for Low-Cost O3 Sensors
Tondini, Stefano; Casari, Paolo
2024-01-01
File | Size | Format
---|---|---
Minimized_Training_of_Machine_Learning-Based_Calibration_Methods_for_Low-Cost_O3_Sensors.pdf (open access; type: publisher's version (publisher's layout); license: Creative Commons) | 2.56 MB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.