Abstract | ||
---|---|---|
Data sets with many discrete variables and relatively few cases arise in health care, e-commerce, information security, text mining, and many other domains. Learning effective and efficient prediction models from such data sets is a challenging task. In this paper, we propose a tabu search-enhanced Markov blanket (TS/MB) algorithm to learn a graphical Markov blanket model for classification of high-dimensional data sets. The TS/MB algorithm makes use of Markov blanket neighborhoods: restricted neighborhoods in a general Bayesian network based on the Markov condition. Computational results from real-world data sets drawn from several domains indicate that the TS/MB algorithm, when used as a feature selection method, is able to find a parsimonious model with substantially fewer predictor variables than are present in the full data set. The algorithm also provides good prediction performance when used as a graphical classifier compared with several machine-learning methods. |
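
The abstract relies on the notion of a Markov blanket without spelling it out. In a Bayesian network, the Markov blanket of a node is the set of its parents, its children, and the other parents of its children; given that set, the node is independent of every remaining variable, which is why a blanket can serve as a feature subset for classification. The sketch below only illustrates this definition on a hand-built graph. It is not the TS/MB algorithm, and the function name `markov_blanket`, the dictionary encoding of the graph, and the toy variables are assumptions made for illustration.

```python
from typing import Dict, Set


def markov_blanket(parents: Dict[str, Set[str]], target: str) -> Set[str]:
    """Return the Markov blanket of `target` in a DAG.

    `parents` maps every node to the set of its parents (hypothetical
    encoding chosen for this sketch). The blanket is the union of the
    target's parents, its children, and its children's other parents.
    """
    direct_parents = set(parents.get(target, set()))
    # Children are the nodes that list the target among their parents.
    children = {node for node, ps in parents.items() if target in ps}
    # Spouses are the other parents of those children.
    spouses: Set[str] = set()
    for child in children:
        spouses |= parents[child]
    blanket = direct_parents | children | spouses
    blanket.discard(target)  # the target is not part of its own blanket
    return blanket


# Hypothetical toy network: A -> Y, Y -> B, C -> B, B -> D, E isolated.
toy_dag = {
    "A": set(),
    "Y": {"A"},
    "C": set(),
    "B": {"Y", "C"},
    "D": {"B"},
    "E": set(),
}

selected = markov_blanket(toy_dag, "Y")
print(sorted(selected))
```

In this toy network the class variable `Y` keeps only its parent `A`, its child `B`, and the co-parent `C`; the grandchild `D` and the isolated node `E` fall outside the blanket, which mirrors how a blanket-based model can end up with fewer predictors than the full data set.
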
Year | DOI | Venue |
---|---|---|
2008 | 10.1287/ijoc.1070.0255 | INFORMS Journal on Computing |
Keywords | Field | DocType |
high dimensions,real-world data,markov blanket neighborhood,mb algorithm,tabu search-enhanced graphical models,graphical markov blanket model,full data,markov condition,efficient prediction model,markov blanket,high-dimensional data set,text analysis,bayesian networks,graphical model,online marketing,tabu search,machine learning | Data mining,Maximum-entropy Markov model,Feature selection,Computer science,Artificial intelligence,Markov blanket,Mathematical optimization,Markov model,Bayesian network,Graphical model,Causal Markov condition,Machine learning,Tabu search | Journal |
Volume | Issue | ISSN |
20 | 3 | 1091-9856 |
Citations | PageRank | References |
8 | 0.53 | 23 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Xue Bai | 1 | 8 | 0.53 |
Rema Padman | 2 | 365 | 57.71 |
Joseph D. Ramsey | 3 | 567 | 33.56 |
Peter Spirtes | 4 | 616 | 101.07 |