Learning to identify relevant studies for systematic reviews using random forest and external information - Citegraph

Paper Info

Title
Learning to identify relevant studies for systematic reviews using random forest and external information

Abstract
We tackle the problem of automatically filtering studies while preparing Systematic Reviews (SRs), which normally entails manually inspecting thousands of studies to identify the few to be included. The problem is modeled as an imbalanced data classification task where the cost of misclassifying the minority class is higher than the cost of misclassifying the majority class. This work introduces a novel method for representing systematic reviews based not only on lexical features, but also on word clustering and citation features. This novel representation is shown to outperform previously used features in representing systematic reviews, regardless of the classifier. Our work utilizes a random forest classifier with the novel features to accurately predict included studies with high recall. The parameters of the random forest are automatically configured using heuristic methods, thus allowing us to provide a product that is usable in real scenarios. Experiments on a dataset containing 15 systematic reviews that were prepared by health care professionals show that our approach can achieve high recall while helping the SR author save time.
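
The trade-off the abstract describes (missing an included study costs more than screening an irrelevant one) can be illustrated with a small sketch. This is not the authors' method: it assumes some classifier has already assigned each study a relevance score, and it simply picks the highest score cut-off that still reaches a target recall on the included studies. The function name `threshold_for_recall` and the toy scores are hypothetical.

```python
from typing import List, Tuple


def threshold_for_recall(scores: List[float], labels: List[int],
                         target_recall: float) -> Tuple[float, float, float]:
    """Return (threshold, recall, fraction_screened).

    Studies with score >= threshold are handed to the reviewer.
    labels use 1 for included studies and 0 for excluded ones.
    Hypothetical helper for illustration only.
    """
    if not 0.0 < target_recall <= 1.0:
        raise ValueError("target_recall must be in (0, 1]")
    total_included = sum(labels)
    if total_included == 0:
        raise ValueError("need at least one included study")

    # Try cut-offs from strictest to most lenient and stop at the first
    # one that recovers enough of the included studies.
    for threshold in sorted(set(scores), reverse=True):
        flagged = [y for s, y in zip(scores, labels) if s >= threshold]
        recall = sum(flagged) / total_included
        if recall >= target_recall:
            return threshold, recall, len(flagged) / len(labels)

    # Not reachable: the lowest cut-off flags every study (recall 1.0).
    raise RuntimeError("no threshold reached the target recall")


# Toy data: 10 candidate studies, 3 of which belong in the review.
scores = [0.9, 0.8, 0.7, 0.4, 0.35, 0.3, 0.2, 0.1, 0.05, 0.01]
labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]

threshold, recall, screened = threshold_for_recall(scores, labels, 1.0)
print(threshold, recall, screened)
```

On this toy data, full recall requires the reviewer to screen the six highest-scoring studies, so the remaining four can be skipped. Lowering the target recall raises the cut-off and shrinks the screening load, at the price of missed studies.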

Year	DOI	Venue
2016	10.1007/s10994-015-5535-7	Machine Learning
Keywords	Field	DocType
Systematic review,Classification,Inclusion prediction	USable,Data mining,Systematic review,Computer science,Filter (signal processing),Heuristics,Artificial intelligence,Data classification,Classifier (linguistics),Cluster analysis,Random forest,Machine learning	Journal
Volume	Issue	ISSN
102	3	0885-6125
Citations	PageRank	References
7	0.73	18
Authors
5

Authors (5 rows)

Name	Order	Citations	PageRank
Madian Khabsa	1	237	18.81
Ahmed K. Elmagarmid	2	3720	626.92
Ihab F. Ilyas	3	2907	117.27
Hossam M. Hammady	4	41	2.78
Mourad Ouzzani	5	1213	120.36
