Title
Data Mining and Model Simplicity: A Case Study in Diagnosis
Abstract
We describe the results of performing data mining on a challenging medical diagnosis domain, acute abdominal pain. This domain is well known to be difficult, yielding little more than 60% predictive accuracy for most human and machine diagnosticians. Moreover, many researchers argue that one of the simplest approaches, the naive Bayesian classifier, is optimal. By comparing the performance of the naive Bayesian classifier to its more general cousin, the Bayesian network classifier, and to selective Bayesian classifiers with just 10% of the total attributes, we show that the simplest models perform at least as well as the more complex models. We argue that simple models like the selective naive Bayesian classifier will perform as well as more complicated models for similarly complex domains with relatively small data sets, thereby calling into question the extra expense necessary to induce more complex models.
Year
1996
Venue
KDD
Keywords
bayesian classifier, bayesian network, medical diagnosis, data mining
Field
Data mining, Variable-order Bayesian network, Small data, Computer science, Bayesian network classifier, Artificial intelligence, Medical diagnosis, Machine learning, Bayesian probability, Naive Bayesian classifier
DocType
Conference
Citations
4
PageRank
0.57
References
7
Authors
2
Name | Order | Citations | PageRank
Gregory Provan | 1 | 503 | 120.02
Moninder Singh | 2 | 381 | 105.12