Title
Data Mining of Mass Storage Based on Cloud Computing
Abstract
Cloud computing is an elastic computing model in which users lease resources from rentable infrastructure. It is gaining popularity because of its lower cost, high reliability, and high availability. To exploit this capability, this paper applies cloud computing to data mining and machine learning. The Netflix Prize, one of the most influential open competitions in machine learning, drove thousands of teams around the world to attack the problem with its mass-storage dataset. The final winner was the BellKor's Pragmatic Chaos team, which bested Netflix's own rating-prediction algorithm by 10%. Their solution is an ensemble of a large number of models, each of which specializes in a different aspect of the data. Among these models, k-nearest neighbors (KNN) and the Restricted Boltzmann Machine (RBM) are reported to be the two most important and successful. We therefore build two predictors, one based on each model, in order to test their performance on cloud computing platforms. The results show that KNN achieves a root mean square deviation (RMSE) of 0.9468 after Global Effect (GE) data preprocessing, which is better than Cinematch's RMSE of 0.951. The RMSE of the RBM algorithm is about 0.9670 on the raw dataset, which can be further improved by the KNN model.
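The RMSE figures quoted in the abstract can be read with the help of a minimal sketch of the metric. This is not the authors' code: the function names `rmse` and `relative_improvement` and the sample ratings are hypothetical, chosen only for illustration, and only the values 0.951 and 0.9468 come from the abstract.

```python
import math


def rmse(predicted, actual):
    """Root mean square deviation between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(actual))


def relative_improvement(baseline_rmse, model_rmse):
    """Fractional RMSE reduction of a model relative to a baseline."""
    return (baseline_rmse - model_rmse) / baseline_rmse


# Hypothetical ratings on a 1-5 star scale, for illustration only.
actual = [4, 3, 5, 2, 4]
predicted = [3.8, 3.4, 4.1, 2.9, 4.0]
print(round(rmse(predicted, actual), 4))

# RMSE values quoted in the abstract: Cinematch (baseline) vs. KNN after GE.
print(round(relative_improvement(0.951, 0.9468), 4))
```

A lower RMSE means predictions sit closer to the true ratings, so the second call expresses the KNN result as a fractional reduction relative to the Cinematch baseline.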
Year
2010
DOI
10.1109/GCC.2010.89
Venue
GCC
Keywords
restricted boltzmann machine,root mean square deviation,boltzmann machines,k-nearest neighbors,different aspect,knn model,learning (artificial intelligence),elastic computing model,netflix prize,cloud computing platform,machine learning field,rbm algorithm,data mining,cloud computing,successful model,mass storage,different model,computational modeling,computer model,k nearest neighbor,correlation,data preprocessing,k nearest neighbors,boltzmann machine,learning artificial intelligence,machine learning,motion pictures
Field
k-nearest neighbors algorithm,Data mining,Restricted Boltzmann machine,Computer science,Mean squared error,Data pre-processing,Root-mean-square deviation,Artificial intelligence,Machine learning,Cloud computing,Mass storage
DocType
Conference
ISBN
978-0-7695-4313-0
Citations
5
PageRank
0.67
References
3
Authors
4
Name            Order  Citations  PageRank
Jianzong Wang   1      61         34.65
Jiguang Wan     2      29         9.71
Zhuo Liu        3      24         9.39
Peng Wang       4      50         10.52