Application of functional data analysis for the treatment of missing air quality data

Bibliographic Details
Published in: Sains Malaysiana
Main Authors: Shaadan N.; Deni S.M.; Jemain A.A.
Format: Article
Language: English
Published: Penerbit Universiti Kebangsaan Malaysia 2015
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-84952043609&doi=10.17576%2fjsm-2015-4410-19&partnerID=40&md5=5bab3eb018e5f5bfeaca2c37b86eb086
id 2-s2.0-84952043609
Missing records are a common data quality problem in most research, including environmental research. This study introduces several imputation methods based on functional data analysis techniques and investigates their capability for estimating missing values. Single imputation and iterative imputation methods are carried out by means of curve estimation, using regression and roughness penalty smoothing approaches. The methods are compared on a reference data set of real PM10 data from an air quality monitoring station, the Petaling Jaya station in the western part of Peninsular Malaysia. One hundred data sets with missing values, generated from the reference data set under six different missing-value patterns, are used to assess performance. The patterns are simulated from three percentages of missing values (5, 10 and 15%) combined with two maximum gap lengths (3 and 7 consecutive missing points). Judged by the mean absolute error, the index of agreement and the coefficient of determination, the iterative imputation method using the roughness penalty approach is more flexible than, and superior to, the other methods.
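The evaluation design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the gap-placement scheme in `simulate_gaps`, the synthetic series, and the use of linear interpolation as a stand-in imputer are all assumptions made for illustration (the study itself uses regression and roughness penalty smoothing). The index of agreement follows Willmott's usual definition, and the coefficient of determination is taken here as the squared Pearson correlation, which is also an assumption.

```python
import numpy as np


def simulate_gaps(n, pct, max_gap, rng):
    """Return a boolean mask (True = missing) covering pct% of n points.

    Hypothetical scheme: place isolated runs of 1..max_gap consecutive
    points so that no run ever exceeds max_gap. Intended for small pct.
    """
    target = int(round(n * pct / 100))
    mask = np.zeros(n, dtype=bool)
    while mask.sum() < target:
        gap = min(int(rng.integers(1, max_gap + 1)), target - int(mask.sum()))
        start = int(rng.integers(1, n - gap))  # first and last points stay observed
        # Reject placements that overlap or touch an existing run.
        if mask[start - 1:start + gap + 1].any():
            continue
        mask[start:start + gap] = True
    return mask


def mean_absolute_error(obs, pred):
    return float(np.mean(np.abs(pred - obs)))


def index_of_agreement(obs, pred):
    """Willmott's index of agreement d, bounded between 0 and 1."""
    obar = obs.mean()
    denom = np.sum((np.abs(pred - obar) + np.abs(obs - obar)) ** 2)
    return float(1.0 - np.sum((pred - obs) ** 2) / denom)


def coefficient_of_determination(obs, pred):
    """Squared Pearson correlation between observed and imputed values."""
    r = np.corrcoef(obs, pred)[0, 1]
    return float(r ** 2)


# Synthetic hourly-like series standing in for the reference data (not real PM10).
rng = np.random.default_rng(0)
t = np.arange(240)
y = 50 + 15 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 3, t.size)

# One of the six patterns: 10% missing, maximum gap length 3.
mask = simulate_gaps(t.size, pct=10, max_gap=3, rng=rng)

# Stand-in imputer: linear interpolation across the observed points.
imputed = np.interp(t[mask], t[~mask], y[~mask])

print("MAE:", mean_absolute_error(y[mask], imputed))
print("d:  ", index_of_agreement(y[mask], imputed))
print("R2: ", coefficient_of_determination(y[mask], imputed))
```

Repeating the last block over many simulated masks and over each percentage and gap-length combination would reproduce the general shape of the comparison, with any imputation method substituted for the interpolation step.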
author Shaadan N.; Deni S.M.; Jemain A.A.
title Application of functional data analysis for the treatment of missing air quality data
publishDate 2015
container_title Sains Malaysiana
container_volume 44
container_issue 10
doi_str_mv 10.17576/jsm-2015-4410-19
url https://www.scopus.com/inward/record.uri?eid=2-s2.0-84952043609&doi=10.17576%2fjsm-2015-4410-19&partnerID=40&md5=5bab3eb018e5f5bfeaca2c37b86eb086
publisher Penerbit Universiti Kebangsaan Malaysia
issn 0126-6039
language English
format Article
accesstype All Open Access; Gold Open Access
record_format scopus
collection Scopus