Title
On sensitivity of case-based reasoning to optimal feature subsets in business failure prediction
Abstract
Case-based reasoning (CBR) was first introduced into the area of business failure prediction (BFP) in 1996. The conclusion drawn in that first application was that CBR is no more applicable than multiple discriminant analysis (MDA) and Logit. However, some arguments claim that CBR, with k-nearest neighbor (k-NN) at its heart, is not necessarily outperformed by those machine learning techniques. In this research, we investigate whether CBR is sensitive to so-called optimal feature subsets in BFP, since the feature subset is an important factor in CBR's performance. When CBR is used to solve such a classification problem, mainly the retrieval step of its life-cycle is employed. We use the classical Euclidean metric to calculate case similarity. Empirical data from two years prior to failure are collected from the Shanghai Stock Exchange and the Shenzhen Stock Exchange in China. Four filters (the MDA stepwise method, the Logit stepwise method, one-way ANOVA, and the independent-samples t-test) and one wrapper (a genetic algorithm) are employed to generate five optimal feature subsets after data normalization. Predictive performance is assessed with a thirty-times hold-out method, which combines leave-one-out cross-validation with the hold-out method. The two statistical baseline models, i.e. MDA and Logit, and the newer model of support vector machine (SVM) are employed as comparative models. Empirical results indicate that CBR is indeed sensitive to optimal feature subsets in medium-term BFP. The stepwise method of MDA, a filter approach, is the first choice for selecting optimal feature subsets for CBR, followed by the stepwise method of Logit and the wrapper. The two filter approaches of ANOVA and the t-test are the fourth choice.
If the MDA stepwise method is employed to select the optimal feature subset for the CBR system, there is no significant difference in predictive performance for medium-term BFP between CBR and the other three models (MDA, Logit, and SVM). In contrast, CBR is outperformed by the three models at the 1% significance level if ANOVA or the t-test is used as the feature selection method for CBR.
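The retrieval step described in the abstract can be sketched as a plain Euclidean-distance k-NN vote over a case base of normalized feature vectors. This is a minimal illustration, not the authors' implementation; the function names and toy data are assumptions.

```python
import math
from collections import Counter

def euclidean_distance(a, b):
    # classical Euclidean metric between two normalized feature vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(case_base, query, k=3):
    # case_base: list of (feature_vector, label); label 1 = failed, 0 = healthy.
    # Retrieve the k most similar stored cases and take a majority vote.
    neighbors = sorted(case_base, key=lambda case: euclidean_distance(case[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy case base with two normalized financial-ratio features (illustrative values only)
case_base = [
    ([0.9, 0.8], 0), ([0.8, 0.9], 0),
    ([0.2, 0.1], 1), ([0.1, 0.3], 1),
]
print(knn_predict(case_base, [0.15, 0.2], k=3))  # → 1 (predicted failure)
```

In the paper's setting, the feature vectors would be the financial ratios chosen by the respective feature selection method, which is why the selected subset directly shapes the distances and hence the retrieval result.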
Year
DOI
Venue
2010
10.1016/j.eswa.2009.12.034
Expert Syst. Appl.
Keywords
Field
DocType
optimal feature subsets,cbr system,case-based reasoning (cbr),feature selection,feature selection method,filter approach,filters,wrappers,business failure prediction (bfp),medium-term bfp,chinese listed company,mda stepwise method,logit stepwise method,stepwise method,k-nearest neighbor,predictive performance,stock exchange,thirty-times hold-out method,leave-one-out cross-validation,genetic algorithm,machine learning,life cycle,multiple discriminant analysis,comparative modeling,support vector machine
Logit,k-nearest neighbors algorithm,Data mining,Feature selection,Computer science,Multiple discriminant analysis,Support vector machine,Euclidean distance,Artificial intelligence,Case-based reasoning,Machine learning,Database normalization
Journal
Volume
Issue
ISSN
37
7
Expert Systems With Applications
Citations 
PageRank 
References 
6
0.42
44
Authors
4
Name
Order
Citations
PageRank
Hui Li, 1, 472, 15.82
Hai-Bin Huang, 2, 25, 7.59
Jie Sun, 3, 374, 12.21
Chuang Lin, 4, 3040, 390.74