Critical Factors for the Use of Machine Learning to Predict Lake Surface Water Temperature / Yousefi, A.; Toffolon, M.. - In: JOURNAL OF HYDROLOGY. - ISSN 0022-1694. - 606:(2022). [10.1016/j.jhydrol.2021.127418]
Critical Factors for the Use of Machine Learning to Predict Lake Surface Water Temperature
Yousefi, A.; Toffolon, M.
2022-01-01
Abstract
Models based on Machine Learning (ML) are pervading all fields of science and practical applications, including the forecasting of water temperature in lakes, a crucial variable for ecosystems and a proxy of climate change. Here, we review the ML algorithms most used in this field and highlight some physical constraints that should be carefully considered when adopting a black-box approach. To illustrate them, we refer to an artificial case study representing a temperate lake simulated by means of a physically based model, for which we have full control of input and output variables, and we restrict the analysis to lake surface water temperature (LSWT). Three main factors are relevant for a successful prediction of LSWT by means of ML models: the choice of the predictors (mostly meteorological variables), their pre-processing (we tested three approaches), and the specific ML algorithm (we tested nine). We show that selecting suitable physical inputs plays the most important role. In our case study, which is the product of a numerical model and not a real lake, the minimum information needed to obtain acceptable results is air temperature (AT) and day of the year. Using additional predictors does not substantially improve performance (the relative improvement in RMSE was 7.75% for the test data set). We also demonstrate that better results than in the normal case are obtained either by pre-processing the air temperature data, averaging them over a time window, or by including values from previous days as model inputs. Considering the recent history of the forcing (AT) is consistent with the physical fact that their large water mass makes lakes act as “filters” in their thermal response (which is thus also influenced by AT from previous days), and this filtering changes with the lake's depth.
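The two ways of feeding the recent history of AT to a model, averaging over a time window or passing values from previous days as separate inputs, can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `trailing_mean` and `lagged_rows`, the window length, the number of lags, and the sample temperatures are all hypothetical choices made for the example.

```python
from math import nan


def trailing_mean(at, window):
    """Average AT over the current day and the (window - 1) previous days.

    Days without a full window of history are returned as NaN.
    """
    out = [nan] * len(at)
    running = 0.0
    for i, value in enumerate(at):
        running += value
        if i >= window:
            # Drop the value that has just left the window.
            running -= at[i - window]
        if i >= window - 1:
            out[i] = running / window
    return out


def lagged_rows(at, n_lags):
    """Build one input row per day: [AT(t), AT(t-1), ..., AT(t-n_lags)].

    Days without n_lags previous values are dropped.
    """
    return [
        [at[t - k] for k in range(n_lags + 1)]
        for t in range(n_lags, len(at))
    ]


# Hypothetical daily air temperatures (degrees Celsius), for illustration only.
air_temperature = [1.0, 2.0, 3.0, 4.0, 5.0]

smoothed = trailing_mean(air_temperature, window=3)
rows = lagged_rows(air_temperature, n_lags=2)
```

Either output would then be combined with the day of the year and passed to the chosen ML algorithm. How long the window or how many lags to use is left open here, since the abstract notes that the filtering behaviour depends on the lake's depth.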
Finally, we did not find a definite answer about a single optimal ML algorithm when using the same inputs (although artificial neural networks had slightly better results), suggesting that insight into the physical dynamics is still the most important factor for a successful exploitation of ML.

File | Access | Description | Type | License | Size | Format
---|---|---|---|---|---|---
Yousefi_Toffolon_JH2022.pdf | Archive managers only | | Publisher's version (Publisher’s layout) | All rights reserved | 7.45 MB | Adobe PDF
Yousefi_Toffolon_JH2022-SM.pdf | Open access | Supplementary Material | Other attachments | All rights reserved | 691.13 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.