Summary: | Data-driven insights are now routinely relied on for informed decision-making, but their accuracy depends directly on the quality of the underlying data. Although vast amounts of data are accumulated and stored, not all of it meets the standard for high quality, and it often contains numerous issues. This study explores initial steps towards improving data quality, beginning with the automatic detection of errors within datasets. It pursues three objectives: first, to identify prevalent data issues based on recurrent errors; second, to devise effective methods for translating those recurrent issues into rules that can be integrated for automated detection; and third, to investigate the most effective approach for routine error checks. The study attempts to develop these objectives and integrate them into a single system. The intended outcome is a system that generates a comprehensive issue report with each iteration of error checking, ready for the next step towards enhancing data quality. © 2023 IEEE.
|