Research on Data Storage and Processing Optimization Based on Federation HDFS and Spark. - Citegraph

Paper Info

Title
Research on Data Storage and Processing Optimization Based on Federation HDFS and Spark.

Abstract
Hadoop and Spark provide undifferentiated services for data storage and processing, so they cannot meet the on-demand service requirements of different users or different types of data. To address this, this paper proposes a system architecture for data storage optimization based on Federation HDFS and Spark. Incoming data from different users or of different types is classified with the Naive Bayes algorithm. The classified data is stored in Federation HDFS under different backup policies, and Spark processes it according to its priority. This approach enables differentiated service and improves service quality. The experimental results show that the architecture provides different storage strategies and processing priorities for data of different priorities, and that it offers high fault tolerance and reduced processing delay for high-priority data.
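
The pipeline the abstract describes (classify, choose a storage policy, process by priority) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two priority labels, the token features, the `STORAGE_POLICY` table with its namespace paths and replication factors, and the helper names `NaiveBayes` and `plan`. It uses pure Python rather than real HDFS or Spark calls, so it only shows the decision logic.

```python
import math
from collections import Counter, defaultdict

# Hypothetical policy table: priority label -> (federated namespace path, replication factor).
# Paths and factors are illustrative assumptions, not values from the paper.
STORAGE_POLICY = {
    "high": ("/ns-high", 3),
    "low": ("/ns-low", 1),
}


class NaiveBayes:
    """Tiny multinomial Naive Bayes over token lists, with Laplace smoothing."""

    def fit(self, samples, labels):
        self.label_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for tokens, label in zip(samples, labels):
            self.token_counts[label].update(tokens)
        self.vocab = {t for c in self.token_counts.values() for t in c}
        return self

    def predict(self, tokens):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # log prior + sum of smoothed log likelihoods
            score = math.log(count / total)
            for t in tokens:
                score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def plan(records, model):
    """Classify each record, attach a storage policy, and order high priority first."""
    planned = []
    for name, tokens in records:
        label = model.predict(tokens)
        path, replication = STORAGE_POLICY[label]
        planned.append({
            "name": name,
            "priority": label,
            "path": f"{path}/{name}",
            "replication": replication,
        })
    order = {"high": 0, "low": 1}
    return sorted(planned, key=lambda r: order[r["priority"]])


# Toy training data (made up for this sketch).
model = NaiveBayes().fit(
    [["payment", "alert"], ["alert", "medical"], ["log", "debug"], ["debug", "trace"]],
    ["high", "high", "low", "low"],
)
jobs = plan([("b.log", ["debug"]), ("a.dat", ["alert"])], model)
```

In a real deployment the replication factor could be applied per path (for example with `hdfs dfs -setrep`) and the ordering could be delegated to Spark's FAIR scheduler pools; whether the paper uses these mechanisms is not stated in the abstract.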

Year	2018
DOI	10.1007/978-3-319-93659-8_97
Venue	COMPLEX, INTELLIGENT, AND SOFTWARE INTENSIVE SYSTEMS
Field	Data processing, Spark (mathematics), Naive Bayes classifier, Computer data storage, Computer science, Data type, Fault tolerance, Systems architecture, Backup, Distributed computing
DocType	Conference
Volume	772
ISSN	2194-5357
Citations	0
PageRank	0.34
References	15
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Fangzhou Chen	1	0	1.01
Peng Li	2	1	2.43
He Xu	3	47	12.76
Wenkang Xie	4	0	0.68
