Missing Data Filling Algorithm for Big Data-Based Map-Reduce Technology
Title
Missing Data Filling Algorithm for Big Data-Based Map-Reduce Technology
Subject
Bayesian networks
Inference engines
Data handling
Data mining
Classification (of information)
MapReduce
Probabilistic logics
Query processing
Description
In big data, a large number of missing values is a serious obstacle to making correct decisions. Missing values degrade the quality of information queries, distort data mining and analysis, and lead to misleading decisions. To handle missing values in real databases, the authors prepopulate the missing data and fill in the classification attributes using probabilistic reasoning. The reasoning is carried out in a Bayesian network so that big data processing can be parallelized, and the proposed algorithm is presented in the MapReduce framework. The experimental results show that the Bayesian network construction method and probabilistic inference are effective for processing classification data and for parallelizing the algorithm on Hadoop. Copyright 2022, IGI Global.
2
18
Creator
Li, Fugui
Sharma, Ashutosh
Publisher
International Journal of e-Collaboration
Date
2022
Type
journalArticle
Identifier
15483673
10.4018/IJeC.304036
Citation
Li, Fugui and Sharma, Ashutosh, “Missing Data Filling Algorithm for Big Data-Based Map-Reduce Technology,” Lamar University Midstream Center Research, accessed May 4, 2024, https://lumc.omeka.net/items/show/29395.