Title
Classification Models and Survival Analysis for Prostate Cancer Using RNA Sequencing and Clinical Data
Abstract
Early detection of cancer can significantly increase the chance of successful treatment. This research studies early cancer detection in prostate cancer patients whose cancer tissue was analyzed with Illumina HiSeq ribonucleic acid (RNA) sequencing (RNA-Seq). We assessed the cancer-relevant genes most strongly correlated with two clinical outcomes: sample type (cancer / non-cancer) and overall survival (OS). Traditional cancer diagnosis depends primarily on physicians' experience in identifying morphological abnormalities. Gene expression data can help physicians detect cancer at a much earlier stage and thus significantly improve the prospects of patient treatment. For the classification task, we applied machine learning and data mining approaches to distinguish cancer from non-cancer samples based on gene expression data, with the goal of detecting cancer at the earliest stage. For the regression task, we modeled survival outcomes in prostate cancer patients: regression trees were built using cancer-sensitive genes along with the clinical attribute 'Gleason score' as predictors, and the clinical variable 'overall survival' as the target variable. Extracting knowledge in the form of rules is one of the vital tasks in data mining, as rules provide concise statements of easily understandable and potentially valuable information. For the classification model, we derived rules from a decision tree and interpreted these rules for cancer and non-cancer patients. For the regression (survival) model, we generated rules for estimating the survival time of cancer patients. In this study, cancer-relevant genes were analyzed as predictors, although various other genes may interact with genes currently known to contribute to cancer. These findings have implications for assessing gene-gene and gene-environment interactions in prostate cancer as well as in other types of cancer.
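The rule-derivation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it grows a small Gini-based binary decision tree in plain Python and flattens each root-to-leaf path into an IF/THEN rule. The feature names `GENE_A` and `GENE_B`, the expression values, and the `max_depth` setting are all hypothetical and chosen only for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature_index, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Grow a binary tree; a leaf is the majority class label of its samples."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left],
                           depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right],
                            depth + 1, max_depth),
    }

def extract_rules(tree, names, conditions=()):
    """Turn every root-to-leaf path into one 'IF ... THEN ...' rule string."""
    if not isinstance(tree, dict):
        return ["IF " + " AND ".join(conditions or ("TRUE",)) + " THEN " + str(tree)]
    name, t = names[tree["feature"]], tree["threshold"]
    return (extract_rules(tree["left"], names, conditions + (f"{name} <= {t:.2f}",))
            + extract_rules(tree["right"], names, conditions + (f"{name} > {t:.2f}",)))

def predict(tree, row):
    """Follow the splits from the root until a leaf label is reached."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree

# Hypothetical expression levels for two made-up genes (not real data).
names = ["GENE_A", "GENE_B"]
rows = [
    [2.1, 7.5], [1.8, 6.9], [2.5, 8.1], [3.0, 7.2],  # non-cancer samples
    [6.2, 3.1], [5.8, 2.7], [7.1, 3.5], [6.6, 2.2],  # cancer samples
]
labels = ["non-cancer"] * 4 + ["cancer"] * 4

tree = build_tree(rows, labels)
rules = extract_rules(tree, names)
for rule in rules:
    print(rule)
```

On this toy data a single split separates the classes, so the sketch prints one rule per class. A regression-tree variant for survival time would follow the same structure, typically choosing splits by variance reduction and storing the mean target value in each leaf instead of a majority label.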
Year: 2019
DOI: 10.1109/BigData47090.2019.9006036
Venue: 2019 IEEE International Conference on Big Data (Big Data)
Keywords: data mining, RNA sequencing, Genomic Data Commons, prostate cancer, rules generation, survival analysis
Field: Decision tree, RNA, Computer science, Sample type, Prostate cancer, Artificial intelligence, Oncology, Regression, Internal medicine, Survival analysis, Cancer, Machine learning, Patient treatment
DocType: Conference
ISSN: 2639-1589
Citations: 0
PageRank: 0.34
References: 0
Authors: 2
Name               Order  Citations  PageRank
Md Faisal Kabir    1      0          1.01
Simone A Ludwig    2      1309       179.41