Evaluating the effectiveness of data quality framework in software engineering
The quality of data is important in research working with data sets because poor data quality may lead to invalid results. Data sets contain measurements that are associated with metrics and entities; however, in some data sets, it is not always clear which entities have been measured and exactly wh...
Published in: | International Journal of Electrical and Computer Engineering |
---|---|
Main Author: | Rosli M.M.; Yusop N.S.M. |
Format: | Article |
Language: | English |
Published: | Institute of Advanced Engineering and Science 2022 |
Online Access: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85139005826&doi=10.11591%2fijece.v12i6.pp6410-6422&partnerID=40&md5=2d487912d6fb56b49dc3504fd5bf9cdd |
id |
2-s2.0-85139005826 |
---|---|
author |
Rosli M.M.; Yusop N.S.M. |
title |
Evaluating the effectiveness of data quality framework in software engineering |
publishDate |
2022 |
container_title |
International Journal of Electrical and Computer Engineering |
container_volume |
12 |
container_issue |
6 |
doi_str_mv |
10.11591/ijece.v12i6.pp6410-6422 |
url |
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85139005826&doi=10.11591%2fijece.v12i6.pp6410-6422&partnerID=40&md5=2d487912d6fb56b49dc3504fd5bf9cdd |
description |
The quality of data is important in research working with data sets because poor data quality may lead to invalid results. Data sets contain measurements that are associated with metrics and entities; however, in some data sets, it is not always clear which entities have been measured and exactly which metrics have been used. This means that measurements could be misinterpreted. In this study, we develop a framework for data quality assessment that determines whether a data set has sufficient information to support the correct interpretation of data for analysis in empirical research. The framework incorporates a dataset metamodel and a quality assessment process to evaluate the data set quality. To evaluate the effectiveness of our framework, we conducted a user study. We used observations, a questionnaire and think aloud approach to provide insights into the framework through participant thought processes while applying the framework. The results of our study provide evidence that most participants successfully applied the definitions of dataset category elements and the formal definitions of data quality issues to the datasets. Further work is needed to reproduce our results with more participants, and to determine whether the data quality framework is generalizable to other types of data sets. © 2022 Institute of Advanced Engineering and Science. All rights reserved. |
publisher |
Institute of Advanced Engineering and Science |
issn |
20888708 |
language |
English |
format |
Article |
accesstype |
All Open Access; Gold Open Access |
record_format |
scopus |
collection |
Scopus |