Ronny Berndtsson

Professor, Dep Director, MECW Dep Scientific Coordinator

Interpretable machine learning for predicting the fate and transport of pentachlorophenol in groundwater

Author

  • Mehran Rad
  • Azra Abtahi
  • Ronny Berndtsson
  • Ursula S McKnight
  • Amir Aminifar

Summary, in English

Pentachlorophenol (PCP) is a commonly found recalcitrant and toxic groundwater contaminant that resists degradation, bioaccumulates, and has a potential for long-range environmental transport. Taking proper action to deal with the pollutant while accounting for life cycle consequences requires a better understanding of its behavior in the subsurface. We recognize the huge potential for enhancing decision-making at contaminated groundwater sites with the arrival of machine learning (ML) techniques in environmental applications. We used ML to enhance the understanding of the dynamics of PCP transport properties in the subsurface, and to determine key hydrochemical and hydrogeological drivers affecting its transport and fate. We demonstrate how this complementary knowledge, provided by data-driven methods, may enable more targeted planning of monitoring and remediation at two highly contaminated Swedish groundwater sites, where the method was validated. We evaluated 6 interpretable ML methods, 3 linear regressors and 3 non-linear (i.e., tree-based) regressors, to predict PCP concentration in the groundwater. The modeling results indicate that simple linear ML models were useful for predicting observations in datasets without any missing values, while tree-based regressors were more suitable for datasets containing missing values. Considering that missing values are common in datasets collected during contaminated site investigations, this could be of significant importance for contaminated site planners and managers, ultimately reducing site investigation and monitoring costs. Furthermore, we interpreted the proposed models using the SHAP (SHapley Additive exPlanations) approach to decipher the importance of different drivers in the prediction and simulation of critical hydrogeochemical variables. Among these, the sum of chlorophenols is of highest significance in the analyses. When that variable is excluded from the model, tetrachlorophenols, dissolved organic carbon, and conductivity were found to be of highest importance. Accordingly, ML methods could potentially be used to improve the understanding of groundwater contamination transport dynamics, filling gaps in knowledge that remain when using more sophisticated deterministic modeling approaches.
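
The Shapley-value attribution behind the SHAP approach mentioned in the summary can be illustrated with a minimal sketch. This is not the authors' code or data: the toy predictor, its coefficients, the feature names, and the sample and baseline values are all hypothetical, chosen only to show how a prediction is split into per-feature contributions. Features left out of a coalition are replaced by a baseline value, which is one common simplification; a real analysis would more likely apply an established SHAP implementation to the fitted regressors.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attributions for one sample x relative to a baseline.

    Features absent from a coalition take their baseline value.
    Cost grows as 2^n, so this is only practical for a few features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


# Hypothetical feature names and toy model (illustrative assumptions only).
FEATURES = ["sum_chlorophenols", "dissolved_organic_carbon", "conductivity"]


def toy_model(v):
    # Linear terms plus one interaction between the first two features.
    return 0.8 * v[0] + 0.3 * v[1] + 0.1 * v[2] + 0.05 * v[0] * v[1]


baseline = [1.0, 2.0, 3.0]  # hypothetical reference values
sample = [4.0, 1.0, 5.0]    # hypothetical observation

phi = shapley_values(toy_model, sample, baseline)
for name, value in zip(FEATURES, phi):
    print(f"{name}: {value:+.3f}")
```

The contributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the additivity property that makes such attributions usable for ranking drivers.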

Department/s

  • Division of Water Resources Engineering
  • LTH Profile Area: AI and Digitalization
  • LTH Profile Area: Engineering Health
  • Networks and Security
  • Centre for Advanced Middle Eastern Studies (CMES)
  • MECW: The Middle East in the Contemporary World
  • LTH Profile Area: Water
  • Department of Electrical and Information Technology

Publishing year

2024-03-15

Language

English

Publication/Series

Environmental Pollution

Volume

345

Document type

Journal article

Publisher

Elsevier

Topic

  • Environmental Sciences

Keywords

  • Contaminated sites
  • Explainable artificial intelligence
  • SHAP value
  • Sustainable remediation
  • Tree-based regression

Status

Published

Research group

  • Networks and Security

ISBN/ISSN/Other

  • ISSN: 0269-7491