Models based on Machine Learning (ML) are pervading all fields of science and practical applications, including the problem of forecasting water temperature in lakes, a crucial variable for ecosystems and a proxy of climate change. Here, we review the ML algorithms most used in this field and highlight some physical constraints that should be carefully considered when adopting a black-box approach. To illustrate them, we refer to an artificial case study representing a temperate lake simulated by means of a physically based model, for which we have full control of input and output variables, and we restrict the analysis to lake surface water temperature (LSWT). Three main factors are relevant for a successful prediction of LSWT by means of ML models: the choice of the predictors (mostly meteorological variables), their pre-processing (we tested three approaches), and the specific ML algorithm (we tested nine). We show that selecting suitable physical inputs plays the most important role. In our case study, which is the product of a numerical model and not a real lake, the minimum information needed to obtain acceptable results is air temperature (AT) and day of the year. The use of additional predictors does not substantially improve performance (the relative improvement of RMSE was 7.75% for the test data set). We also demonstrate that results improve on this baseline when AT is either pre-processed by averaging it over a time window or supplemented with values from previous days as model inputs. Considering the recent history of the forcing (AT) is consistent with the physical fact that the large water mass makes lakes act as “filters” in their thermal response (which is thus also influenced by AT from previous days), with a filtering effect that changes with the lake's depth.
Finally, we did not find a definite answer about a single optimal ML algorithm when using the same inputs (although artificial neural networks had slightly better results), suggesting that insight into the physical dynamics is still the most important factor for a successful exploitation of ML.
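
The two pre-processing ideas named in the abstract (averaging AT over a time window, or adding AT from previous days as extra inputs), together with day of the year as a predictor, can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the 7-day window, the two lags, the sine/cosine encoding of day of the year, and the definition of relative RMSE improvement as a percentage of the baseline RMSE are all assumptions made for the example.

```python
import math
from datetime import date, timedelta


def rolling_mean(values, window):
    """Trailing mean over the current day and the previous window-1 days.

    Returns None for days without a full window of history.
    """
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            chunk = values[i + 1 - window:i + 1]
            out.append(sum(chunk) / window)
    return out


def lagged(values, lag):
    """Series shifted by `lag` days (None where no earlier value exists)."""
    if lag <= 0:
        return list(values)
    return [None] * lag + list(values[:len(values) - lag])


def build_features(dates, air_temp, window=7, lags=(1, 2)):
    """Build one feature row per day that has complete AT history.

    `window` and `lags` are illustrative defaults (assumptions), not values
    taken from the paper.
    """
    smoothed = rolling_mean(air_temp, window)
    shifted = [lagged(air_temp, k) for k in lags]
    rows = []
    for i, day in enumerate(dates):
        history = [smoothed[i]] + [s[i] for s in shifted]
        if any(v is None for v in history):
            continue  # skip days at the start of the record
        doy = day.timetuple().tm_yday
        # Cyclic encoding of day of the year (an assumption of this sketch).
        angle = 2.0 * math.pi * doy / 365.25
        rows.append({
            "doy": doy,
            "doy_sin": math.sin(angle),
            "doy_cos": math.cos(angle),
            "at": air_temp[i],
            "at_mean": smoothed[i],
            "at_lags": [s[i] for s in shifted],
        })
    return rows


def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_improvement(rmse_baseline, rmse_new):
    """Percentage reduction of RMSE with respect to a baseline (assumed definition)."""
    return 100.0 * (rmse_baseline - rmse_new) / rmse_baseline


if __name__ == "__main__":
    start = date(2020, 1, 1)
    days = [start + timedelta(days=k) for k in range(10)]
    at = [2.0, 3.5, 1.0, 0.5, 4.0, 5.5, 6.0, 4.5, 3.0, 2.5]  # made-up AT values
    for row in build_features(days, at, window=3, lags=(1, 2)):
        print(row)
```

The rows returned by `build_features` could be passed to any regression algorithm; the windowed mean and the lagged values are two alternative ways of giving the model the recent history of the forcing, which is how the abstract accounts for the lake's filtering behaviour.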

Critical Factors for the Use of Machine Learning to Predict Lake Surface Water Temperature / Yousefi, A.; Toffolon, M.. - In: JOURNAL OF HYDROLOGY. - ISSN 0022-1694. - 606:(2022). [10.1016/j.jhydrol.2021.127418]

Critical Factors for the Use of Machine Learning to Predict Lake Surface Water Temperature

Yousefi A.;Toffolon M.
2022-01-01

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/331854