Machine learning algorithms for high-resolution prediction of spatiotemporal distribution of air pollution from meteorological and soil parameters
This study uses machine learning (ML) models for a high-resolution prediction (0.1°×0.1°) of air fine particulate matter (PM2.5) concentration, the most harmful to human health, from meteorological and soil data. Iraq was considered the study area to implement the method. Different lags and the chang...
Published in: | Environment International |
---|---|
Main Author: | Tao H. |
Format: | Article |
Language: | English |
Published: | Elsevier Ltd 2023 |
Online Access: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85153537218&doi=10.1016%2fj.envint.2023.107931&partnerID=40&md5=91f7aa5b1ffbb13a2eae2389f70ff7ea |
id |
2-s2.0-85153537218 |
---|---|
spelling |
2-s2.0-85153537218 Tao H.; Jawad A.H.; Shather A.H.; Al-Khafaji Z.; Rashid T.A.; Ali M.; Al-Ansari N.; Marhoon H.A.; Shahid S.; Yaseen Z.M. Machine learning algorithms for high-resolution prediction of spatiotemporal distribution of air pollution from meteorological and soil parameters 2023 Environment International 175 10.1016/j.envint.2023.107931 https://www.scopus.com/inward/record.uri?eid=2-s2.0-85153537218&doi=10.1016%2fj.envint.2023.107931&partnerID=40&md5=91f7aa5b1ffbb13a2eae2389f70ff7ea This study uses machine learning (ML) models for a high-resolution prediction (0.1°×0.1°) of air fine particulate matter (PM2.5) concentration, the most harmful to human health, from meteorological and soil data. Iraq was considered the study area to implement the method. Different lags and the changing patterns of four European Reanalysis (ERA5) meteorological variables, rainfall, mean temperature, wind speed and relative humidity, and one soil parameter, the soil moisture, were used to select the suitable set of predictors using a non-greedy algorithm known as simulated annealing (SA). The selected predictors were used to simulate the temporal and spatial variability of air PM2.5 concentration over Iraq during the early summer (May-July), the most polluted months, using three advanced ML models, extremely randomized trees (ERT), stochastic gradient descent backpropagation (SGD-BP) and long short-term memory (LSTM) integrated with Bayesian optimizer. The spatial distribution of the annual average PM2.5 revealed the population of the whole of Iraq is exposed to a pollution level above the standard limit. The changes in temperature and soil moisture and the mean wind speed and humidity of the month before the early summer can predict the temporal and spatial variability of PM2.5 over Iraq during May-July. 
Results revealed the higher performance of LSTM with normalized root-mean-square error and Kling-Gupta efficiency of 13.4% and 0.89, compared to 16.02% and 0.81 for SGD-BP and 17.9% and 0.74 for ERT. The LSTM could also reconstruct the observed spatial distribution of PM2.5 with MapCurve and Cramer's V values of 0.95 and 0.91, compared to 0.9 and 0.86 for SGD-BP and 0.83 and 0.76 for ERT. The study provided a methodology for forecasting spatial variability of PM2.5 concentration at high resolution during the peak pollution months from freely available data, which can be replicated in other regions for generating high-resolution PM2.5 forecasting maps. © 2023 The Authors Elsevier Ltd 0160-4120 English Article All Open Access; Gold Open Access |
author |
Tao H.; Jawad A.H.; Shather A.H.; Al-Khafaji Z.; Rashid T.A.; Ali M.; Al-Ansari N.; Marhoon H.A.; Shahid S.; Yaseen Z.M. |
spellingShingle |
Tao H.; Jawad A.H.; Shather A.H.; Al-Khafaji Z.; Rashid T.A.; Ali M.; Al-Ansari N.; Marhoon H.A.; Shahid S.; Yaseen Z.M. Machine learning algorithms for high-resolution prediction of spatiotemporal distribution of air pollution from meteorological and soil parameters |
author_facet |
Tao H.; Jawad A.H.; Shather A.H.; Al-Khafaji Z.; Rashid T.A.; Ali M.; Al-Ansari N.; Marhoon H.A.; Shahid S.; Yaseen Z.M. |
author_sort |
Tao H.; Jawad A.H.; Shather A.H.; Al-Khafaji Z.; Rashid T.A.; Ali M.; Al-Ansari N.; Marhoon H.A.; Shahid S.; Yaseen Z.M. |
title |
Machine learning algorithms for high-resolution prediction of spatiotemporal distribution of air pollution from meteorological and soil parameters |
title_short |
Machine learning algorithms for high-resolution prediction of spatiotemporal distribution of air pollution from meteorological and soil parameters |
title_full |
Machine learning algorithms for high-resolution prediction of spatiotemporal distribution of air pollution from meteorological and soil parameters |
title_fullStr |
Machine learning algorithms for high-resolution prediction of spatiotemporal distribution of air pollution from meteorological and soil parameters |
title_full_unstemmed |
Machine learning algorithms for high-resolution prediction of spatiotemporal distribution of air pollution from meteorological and soil parameters |
title_sort |
Machine learning algorithms for high-resolution prediction of spatiotemporal distribution of air pollution from meteorological and soil parameters |
publishDate |
2023 |
container_title |
Environment International |
container_volume |
175 |
container_issue |
|
doi_str_mv |
10.1016/j.envint.2023.107931 |
url |
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85153537218&doi=10.1016%2fj.envint.2023.107931&partnerID=40&md5=91f7aa5b1ffbb13a2eae2389f70ff7ea |
description |
This study uses machine learning (ML) models for a high-resolution prediction (0.1°×0.1°) of air fine particulate matter (PM2.5) concentration, the most harmful to human health, from meteorological and soil data. Iraq was considered the study area to implement the method. Different lags and the changing patterns of four European Reanalysis (ERA5) meteorological variables, rainfall, mean temperature, wind speed and relative humidity, and one soil parameter, the soil moisture, were used to select the suitable set of predictors using a non-greedy algorithm known as simulated annealing (SA). The selected predictors were used to simulate the temporal and spatial variability of air PM2.5 concentration over Iraq during the early summer (May-July), the most polluted months, using three advanced ML models, extremely randomized trees (ERT), stochastic gradient descent backpropagation (SGD-BP) and long short-term memory (LSTM) integrated with Bayesian optimizer. The spatial distribution of the annual average PM2.5 revealed the population of the whole of Iraq is exposed to a pollution level above the standard limit. The changes in temperature and soil moisture and the mean wind speed and humidity of the month before the early summer can predict the temporal and spatial variability of PM2.5 over Iraq during May-July. Results revealed the higher performance of LSTM with normalized root-mean-square error and Kling-Gupta efficiency of 13.4% and 0.89, compared to 16.02% and 0.81 for SGD-BP and 17.9% and 0.74 for ERT. The LSTM could also reconstruct the observed spatial distribution of PM2.5 with MapCurve and Cramer's V values of 0.95 and 0.91, compared to 0.9 and 0.86 for SGD-BP and 0.83 and 0.76 for ERT. The study provided a methodology for forecasting spatial variability of PM2.5 concentration at high resolution during the peak pollution months from freely available data, which can be replicated in other regions for generating high-resolution PM2.5 forecasting maps. 
© 2023 The Authors |
publisher |
Elsevier Ltd |
issn |
0160-4120 |
language |
English |
format |
Article |
accesstype |
All Open Access; Gold Open Access |
record_format |
scopus |
collection |
Scopus |
_version_ |
1809678018207023104 |
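
The description above ranks the models by normalized root-mean-square error (NRMSE) and Kling-Gupta efficiency (KGE). The following is a minimal sketch of how these two scores could be computed, not the authors' code. It assumes that the RMSE is normalized by the mean of the observations (the record does not say which normalization was used) and uses the standard KGE form built from correlation, variability ratio and bias ratio. The function names `nrmse_percent` and `kge` and the sample values are hypothetical.

```python
import math
from statistics import fmean, pstdev


def nrmse_percent(obs, sim):
    """RMSE divided by the mean observation, in percent.

    Assumption: normalization by the observed mean; the record does not
    state whether the mean, the range or another scale was used.
    """
    if len(obs) != len(sim) or len(obs) == 0:
        raise ValueError("obs and sim must be non-empty and of equal length")
    rmse = math.sqrt(fmean([(s - o) ** 2 for o, s in zip(obs, sim)]))
    return 100.0 * rmse / fmean(obs)


def kge(obs, sim):
    """Kling-Gupta efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
    if len(obs) != len(sim) or len(obs) < 2:
        raise ValueError("obs and sim must have equal length of at least 2")
    mu_o, mu_s = fmean(obs), fmean(sim)
    sd_o, sd_s = pstdev(obs), pstdev(sim)
    # Population covariance, consistent with the population standard deviations.
    cov = fmean([(o - mu_o) * (s - mu_s) for o, s in zip(obs, sim)])
    r = cov / (sd_o * sd_s)   # linear correlation
    alpha = sd_s / sd_o       # variability ratio
    beta = mu_s / mu_o        # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


if __name__ == "__main__":
    # Hypothetical PM2.5 values, for illustration only.
    observed = [10.0, 20.0, 30.0, 40.0]
    simulated = [11.0, 22.0, 33.0, 44.0]  # a uniform 10% overestimate
    print(f"NRMSE = {nrmse_percent(observed, simulated):.2f}%")
    print(f"KGE   = {kge(observed, simulated):.3f}")
```

A KGE of 1 and an NRMSE of 0% indicate a perfect match, so the reported 0.89 and 13.4% for LSTM are closer to the ideal on both scores than the values for SGD-BP and ERT.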