Aquifers are essential for supporting agriculture, industry, and public water systems. However, the accumulation of chemical contaminants in groundwater is a significant concern, posing threats to human health and the environment. In California, as in many other regions, groundwater is a crucial resource, and its contamination is therefore of particular concern. The coexistence of nitrate and arsenic contamination in many aquifers has created a complex situation requiring urgent attention. Understanding the interconnections between these contaminants and their stressors is crucial for effective water resources management, yet this aspect remains only partially explored despite the extensive data available in California. The aim of our study is to predict nitrate and arsenic contamination in California's aquifers using two Machine Learning (ML) algorithms: Random Forest (RF) and a Dense feed-forward Neural Network (DNN). The temporal and spatial variability of these contaminants was modelled using an extensive dataset from various sources spanning over 40 years, from 1980 to 2022. Three different configurations of dataset partitioning were evaluated: a random grouping of the data; the first 38 years of data for calibration and the last 5 years for the test set; and 70% of the monitoring sites for training and the remaining 30% as the test set. Furthermore, the importance of the features in the prediction was assessed. On average, the RF algorithm led to better results than the DNN, although the DNN outperformed RF in predicting the highest values. Both algorithms achieved Nash-Sutcliffe Efficiency (NSE) coefficients exceeding 0.78 for the test set in the first partitioning configuration. The sensitivity analysis of the features revealed that geographic information, along with soil hydraulic properties, had the most significant impact on the prediction.
These results open up new possibilities for predicting future values, reconstructing contaminant time series, and investigating the capability to simulate contamination at ungauged sites. By employing such techniques, researchers and policymakers can proactively safeguard water quality and public health, tackling contamination challenges in California and in other regions facing similar concerns.
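
The three partitioning configurations and the NSE score described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the synthetic sample years and site identifiers, and the 30% test fraction used for the random split are assumptions made for illustration. The 38-year/5-year temporal split is taken to mean 1980-2017 for calibration and 2018-2022 for testing.

```python
import numpy as np

def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency: 1 - sum of squared errors / sum of squared
    deviations of the observations from their mean."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    residual = np.sum((observed - simulated) ** 2)
    spread = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - residual / spread

def random_split(n_samples, test_fraction, rng):
    """Configuration 1: shuffle all samples regardless of site or date."""
    order = rng.permutation(n_samples)
    n_test = int(round(test_fraction * n_samples))
    return order[n_test:], order[:n_test]

def temporal_split(years, first_test_year):
    """Configuration 2: earlier years for calibration, latest years for testing."""
    years = np.asarray(years)
    idx = np.arange(years.size)
    return idx[years < first_test_year], idx[years >= first_test_year]

def site_split(site_ids, test_fraction, rng):
    """Configuration 3: hold out whole monitoring sites, so that test sites
    never appear in the training set."""
    site_ids = np.asarray(site_ids)
    sites = rng.permutation(np.unique(site_ids))
    n_test = int(round(test_fraction * sites.size))
    is_test = np.isin(site_ids, sites[:n_test])
    idx = np.arange(site_ids.size)
    return idx[~is_test], idx[is_test]

# Synthetic stand-in for the monitoring records (hypothetical values).
rng = np.random.default_rng(0)
n = 1000
years = rng.integers(1980, 2023, size=n)   # sample years 1980..2022
site_ids = rng.integers(0, 50, size=n)     # 50 hypothetical sites

train_r, test_r = random_split(n, 0.3, rng)              # 0.3 is an assumption
train_t, test_t = temporal_split(years, first_test_year=2018)
train_s, test_s = site_split(site_ids, 0.3, rng)
```

Each function returns index arrays for the training and test sets, so the same model-fitting code can be reused across the three configurations. An NSE of 1 indicates a perfect match, while 0 means the model is no better than predicting the mean of the observations.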

Predicting Arsenic and Nitrate in California: a spatial and temporal Machine Learning analysis / Zanoni, Maria Grazia; Loreti, Nico; Majone, Bruno; Bellin, Alberto. - ELECTRONIC. - (2023). (Paper presented at the AGU Annual Meeting 2023, held in San Francisco, CA, USA, 11-15 December 2023).

Predicting Arsenic and Nitrate in California: a spatial and temporal Machine Learning analysis

Zanoni, Maria Grazia;Majone, Bruno;Bellin, Alberto
2023-01-01

2023
AGU Fall Meeting Abstracts
Washington DC, USA
American Geophysical Union
Zanoni, Maria Grazia; Loreti, Nico; Majone, Bruno; Bellin, Alberto

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/400618