Initiating Data Quality: A Dynamic Rule-Based System for Detecting Errors in Data

Bibliographic Details
Published in: 2023 IEEE 11th Conference on Systems, Process and Control, ICSPC 2023 - Proceedings
Authors: Zaini N.; Seman M.R.; Ismail A.N.; Majang B.C.; Fadhilah S.A.
Format: Conference paper
Language: English
Published: Institute of Electrical and Electronics Engineers Inc. 2023
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85186688465&doi=10.1109%2fICSPC59664.2023.10420385&partnerID=40&md5=b36e93c536d54c4351cd465ffcfc26de
Description
Summary: Data-driven insights are now routinely used to support decision-making, because data can reveal factual information that guides those decisions. The accuracy of such insights, however, depends on the quality of the underlying data, so high data quality is essential for deriving precise insights. Although vast amounts of data are accumulated and stored, not all of it is of high quality, and much of it contains numerous issues. In this context, the study explores initial steps towards improving data quality, beginning with automatic detection of errors within datasets. It sets out three primary objectives: first, to identify prevalent data issues based on recurrent errors; second, to devise methods for translating those recurrent issues into rules that can be integrated for automated detection; and third, to investigate the most effective approach for routine error checks. The study attempts to develop these objectives and integrate them into a single system. The end goal is a system that generates a comprehensive issue report with each iteration of error checking, ready for the next step towards enhancing data quality. © 2023 IEEE.
DOI: 10.1109/ICSPC59664.2023.10420385