Big Data Analytics Using Multi-Classifier Approach with Rhadoop

Title

Big Data Analytics Using Multi-Classifier Approach with Rhadoop

Subject

Big data
Nearest neighbor search
Digital storage
Application programs
Decision trees
Data mining
Open source software
Visualization
Metadata
Data visualization
Data Analytics
Classification (of information)
Text processing
Motion compensation

Description

Big Data refers to data generated in such volume and at such speed that it is very difficult to analyze with traditional tools. Hadoop provides distributed storage and processing to extract useful information from such data. R, on the other hand, is an open-source data analysis and programming language that facilitates statistical analysis and data visualization. However, R is not scalable: its memory limitations make it difficult to process big data. To bring the data visualization and data transformation capabilities of R to Big Data, in this paper we have integrated R with Hadoop using the RHadoop package and implemented MapReduce forms of the K-Nearest Neighbor, Naive Bayes and Decision Tree classifiers in R. We have also implemented a Multi-Classifier to improve the accuracy of classification. The Multi-Classifier combines the power of the individual classifiers to increase the efficiency and accuracy of classification. We have used a Bayesian combinatorial function and majority voting to combine the above-mentioned classifiers. We have found that the Multi-Classifier approach gives an improvement in parameters such as precision, recall and accuracy. © 2018 IEEE.
478-484
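
The majority-voting step mentioned in the description can be illustrated with a minimal sketch. This is not the authors' RHadoop implementation: it is written in plain Python rather than R, it does not run on MapReduce, and it does not cover the Bayesian combination. The function names, class labels, and sample predictions are all hypothetical, chosen only to show how the outputs of three classifiers might be merged per sample.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by the most classifiers for one sample.

    Ties are broken in favour of the earliest classifier in the list
    (an assumption of this sketch, not something the abstract specifies).
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def combine(per_classifier_predictions):
    """Combine one prediction list per classifier into one label per sample."""
    return [majority_vote(votes) for votes in zip(*per_classifier_predictions)]


# Hypothetical predictions for four samples from three classifiers.
knn = ["spam", "ham", "spam", "ham"]   # K-Nearest Neighbor
nb = ["spam", "spam", "ham", "ham"]    # Naive Bayes
dt = ["ham", "spam", "spam", "ham"]    # Decision Tree

combined = combine([knn, nb, dt])
print(combined)
```

With an odd number of classifiers and two classes, a strict majority always exists; the tie-break rule only matters when there are more classes or an even number of voters.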

Creator

Hiranandani, Priyanka
Pilli, Emmanuel S.
Chand, Nanak
Ramakrishna, C.
Gupta, Madhuri

Publisher

8th International Conference Confluence on Cloud Computing, Data Science and Engineering, Confluence 2018, January 11, 2018 - January 12, 2018

Date

2018

Type

conferencePaper

Identifier

10.1109/CONFLUENCE.2018.8442876

Citation

Hiranandani, Priyanka et al., “Big Data Analytics Using Multi-Classifier Approach with Rhadoop,” Lamar University Midstream Center Research, accessed May 18, 2024, https://lumc.omeka.net/items/show/28302.
