Missing Data Filling Algorithm for Big Data-Based Map-Reduce Technology

Title

Missing Data Filling Algorithm for Big Data-Based Map-Reduce Technology

Subject

Bayesian networks
Inference engines
Data handling
Data mining
Classification (of information)
MapReduce
Probabilistic logics
Query processing

Description

In big data, a large number of missing values is a serious obstacle to reaching correct decisions. The problem degrades the quality of information queries, distorts data mining and analysis, and leads to misleading decisions. To handle missing values in real databases, the authors pre-populate the missing data and fill in the classification attributes using probabilistic reasoning. The reasoning is carried out in a Bayesian network so that big data processing can be parallelized. The proposed algorithm is presented in the MapReduce framework. The experimental results show that the Bayesian network construction method and the probabilistic inference are effective for processing classification data, and that the algorithm runs in parallel on Hadoop. Copyright 2022, IGI Global.
2
18
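
The filling approach summarized in the description can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a naive Bayes structure as the simplest possible Bayesian network, simulates the map and reduce steps in plain Python rather than on Hadoop, and uses hypothetical names (`map_record`, `reduce_counts`, `fill_missing`) and toy data chosen only for illustration.

```python
from collections import Counter
from itertools import chain


def map_record(record, target):
    """Map step (assumed): emit (key, 1) pairs for one complete record."""
    label = record[target]
    yield ("class", label), 1
    for attr, value in record.items():
        if attr != target:
            yield ("cond", attr, value, label), 1


def reduce_counts(pairs):
    """Reduce step (assumed): sum the emitted counts per key."""
    counts = Counter()
    for key, n in pairs:
        counts[key] += n
    return counts


def fill_missing(record, target, counts, alpha=1.0):
    """Fill record[target] with the most probable class label.

    Uses naive Bayes with Laplace smoothing (alpha); this is a stand-in
    for inference in a general Bayesian network.
    """
    labels = sorted(k[1] for k in counts if k[0] == "class")
    total = sum(counts[("class", c)] for c in labels)
    best_label, best_score = None, -1.0
    for c in labels:
        score = counts[("class", c)] / total  # prior P(c)
        for attr, value in record.items():
            if attr == target or value is None:
                continue
            # Number of distinct values seen for this attribute.
            n_values = len({k[2] for k in counts
                            if k[0] == "cond" and k[1] == attr})
            score *= (counts[("cond", attr, value, c)] + alpha) / (
                counts[("class", c)] + alpha * n_values)
        if score > best_score:
            best_label, best_score = c, score
    return {**record, target: best_label}


# Hypothetical toy data; "play" is the classification attribute.
complete = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
counts = reduce_counts(
    chain.from_iterable(map_record(r, "play") for r in complete))
filled = fill_missing(
    {"outlook": "sunny", "windy": "no", "play": None}, "play", counts)
print(filled["play"])
```

Because the map step only emits counts and the reduce step only sums them, the counting phase can be split across records independently, which is the property a MapReduce framework exploits.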

Creator

Li, Fugui
Sharma, Ashutosh

Publisher

International Journal of e-Collaboration

Date

2022

Type

journalArticle

Identifier

15483673
10.4018/IJeC.304036

Citation

Li, Fugui and Sharma, Ashutosh, “Missing Data Filling Algorithm for Big Data-Based Map-Reduce Technology,” Lamar University Midstream Center Research, accessed May 4, 2024, https://lumc.omeka.net/items/show/29395.
