Title
High-performance medical data processing technology based on distributed parallel machine learning algorithm
Abstract
This work aims to improve the efficiency of medical data processing and to establish a sound medical data management system. To apply distributed parallel classification algorithms to intelligent hospital guidance, a Parallel Random Forest (PRF) classification algorithm is proposed on the Apache Spark cloud computing platform. To address the loss of sparse clusters in data sets with variable density distributions, an Adaptive Domain Density Peak Clustering (ADDPC) method is proposed. In addition, a Bilayer Parallel Training-Convolutional Neural Network (BPT-CNN) model based on distributed computing is proposed to detect and classify colon cancer nuclei more accurately with a large-scale parallel deep learning (DL) algorithm. The performance of the proposed models is then evaluated through case analysis. The results show that the PRF algorithm on a distributed cloud computing platform can independently design data-parallel tasks, thereby optimizing data communication cost and efficiency. The ADDPC algorithm can adaptively measure domain density and merge sparse clusters to prevent data loss and fragmentation. The BPT-CNN model improves algorithm performance and balances the workload of each task. The results offer significant reference value for solving problems in medical data processing.
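For readers unfamiliar with density peak clustering, the following is a minimal sketch of the plain (non-adaptive) baseline that methods such as ADDPC build on. It is not the ADDPC method itself: the abstract does not specify how domain density is adapted or how sparse clusters are merged, so neither step is attempted here. The function name `density_peaks`, the cutoff parameter `d_c`, and the toy points are assumptions made for illustration only.

```python
import math


def density_peaks(points, d_c):
    """Return (rho, delta) per point for plain density peak clustering."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    # Visit points from densest to sparsest; ties keep input order.
    order = sorted(range(n), key=lambda i: -rho[i])
    delta = [0.0] * n
    # By convention the densest point gets its largest distance to any point.
    delta[order[0]] = max(dist[order[0]])
    for rank in range(1, n):
        i = order[rank]
        # Distance to the nearest point that ranks as denser.
        delta[i] = min(dist[i][j] for j in order[:rank])
    return rho, delta


# Toy data (assumed): one group of three points and one group of two.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
rho, delta = density_peaks(points, d_c=1.5)

# Candidate cluster centers combine high density with high separation.
centers = sorted(range(len(points)),
                 key=lambda i: rho[i] * delta[i], reverse=True)[:2]
```

With a single fixed cutoff such as `d_c`, a sparse group receives low density values everywhere, which is the kind of situation in variable-density data that the abstract says ADDPC is designed to handle.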
Year
2022
DOI
10.1007/s11227-021-04060-4
Venue
The Journal of Supercomputing
Keywords
Adaptive density peak clustering algorithm, Random forest algorithm, Distributed parallel classification algorithm, Cloud computing
DocType
Journal
Volume
78
Issue
4
ISSN
0920-8542
Citations
0
PageRank
0.34
References
17
Authors
4
Name         Order  Citations  PageRank
Ji Liu       1      0          0.34
Xiao Liang   2      0          0.34
Wenxi Ruan   3      0          0.34
Bo Zhang     4      419.80