The Science and Information (SAI) Organization

DOI: 10.14569/IJACSA.2020.0110254

Towards a Powerful Solution for Data Accuracy Assessment in the Big Data Context

Author 1: Mohamed TALHA
Author 2: Nabil ELMARZOUQI
Author 3: Anas ABOU EL KALAM

International Journal of Advanced Computer Science and Applications (IJACSA), Volume 11 Issue 2, 2020.

Abstract: Data Accuracy is one of the main dimensions of Data Quality; it measures the degree to which data are correct. Knowing the accuracy of its data tells an organization how much reliability it can assign to them in decision-making processes. Measuring data accuracy in a Big Data environment involves comparing the data under assessment with "reference data" that the system considers correct. However, such a process can be complex or even impossible in the absence of appropriate reference data. In this paper, we focus on this problem and propose an approach that uses Big Data technologies to obtain the reference data. Our approach is based on the upstream selection of a set of criteria that we define as "Accuracy Criteria". We also use a set of techniques such as Big Data Sampling, Schema Matching, Record Linkage, and Similarity Measurement. The proposed model and the experimental results increase our confidence in the importance of a data quality assessment solution and of configuring the accuracy criteria to automate the selection of reference data in a Data Lake.

Keywords: Big data; data quality; data accuracy assessment; big data sampling; schema matching; record linkage; similarity measurement
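
As a rough illustration of the comparison step described in the abstract, the sketch below scores a small dataset against reference data. It is not the authors' implementation: the function names, the sample records, the exact-key lookup standing in for Record Linkage, the use of Python's `difflib.SequenceMatcher` as the similarity measure, and the 0.9 threshold are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] after trimming and lowercasing (illustrative measure)."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def accuracy(records, reference, key, fields, threshold=0.9):
    """Share of compared field values that agree with the reference data.

    Returns None when no record could be linked to a reference record.
    """
    # Naive record linkage: exact match on a shared key (an assumption).
    ref_by_key = {r[key]: r for r in reference}
    compared = matched = 0
    for rec in records:
        ref = ref_by_key.get(rec[key])
        if ref is None:
            continue  # no reference record, so this record cannot be assessed
        for field in fields:
            compared += 1
            if similarity(str(rec[field]), str(ref[field])) >= threshold:
                matched += 1
    return matched / compared if compared else None


# Hypothetical data to assess and hypothetical reference data.
records = [
    {"id": 1, "name": "Alice Martin", "city": "Paris"},
    {"id": 2, "name": "Bob Durand", "city": "Lyon"},
    {"id": 3, "name": "Chloe Petit", "city": "Nice"},
]
reference = [
    {"id": 1, "name": "alice martin", "city": "Paris"},
    {"id": 2, "name": "Bob Durand", "city": "Marseille"},
]

score = accuracy(records, reference, key="id", fields=["name", "city"])
print(score)  # 0.75: three of the four compared values agree
```

Record 3 has no counterpart in the reference data and is skipped rather than counted as inaccurate, which mirrors the abstract's point that accuracy cannot be assessed without appropriate reference data.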

Mohamed TALHA, Nabil ELMARZOUQI and Anas ABOU EL KALAM, “Towards a Powerful Solution for Data Accuracy Assessment in the Big Data Context,” International Journal of Advanced Computer Science and Applications (IJACSA), 11(2), 2020. http://dx.doi.org/10.14569/IJACSA.2020.0110254

@article{TALHA2020,
title = {Towards a Powerful Solution for Data Accuracy Assessment in the Big Data Context},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2020.0110254},
url = {http://dx.doi.org/10.14569/IJACSA.2020.0110254},
year = {2020},
publisher = {The Science and Information Organization},
volume = {11},
number = {2},
author = {Mohamed TALHA and Nabil ELMARZOUQI and Anas ABOU EL KALAM}
}



Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.
