Finding Essential Attributes from Binary Data - Citegraph

Paper Info

Title
Finding Essential Attributes from Binary Data

Abstract
We consider data sets that consist of n-dimensional binary vectors representing positive and negative examples for some (possibly unknown) phenomenon. A subset S of the attributes (or variables) of such a data set is called a support set if the positive and negative examples can be distinguished by using only the attributes in S. In this paper we study the problem of finding small support sets, a frequently arising task in various fields, including knowledge discovery, data mining, learning theory, logical analysis of data, etc. We study the distribution of support sets in randomly generated data, and discuss why finding small support sets is important. We propose several measures of separation (real valued set functions over the subsets of attributes), formulate optimization models for finding the smallest subsets maximizing these measures, and devise efficient heuristic algorithms to solve these (typically NP-hard) optimization problems. We prove that several of the proposed heuristics have a guaranteed constant approximation ratio, and we report on computational experience comparing these heuristics with some others from the literature both on randomly generated and on real world data sets.

Year	DOI	Venue
2003	10.1023/A:1024653703689	Ann. Math. Artif. Intell.
Keywords	Field	DocType
feature selection,essential attributes,support sets	Set function,Data mining,Heuristic,Data set,Association rule learning,Heuristics,Knowledge extraction,Artificial intelligence,Binary data,Optimization problem,Mathematics,Machine learning	Journal
Volume	Issue	ISSN
39	3	1573-7470
Citations	PageRank	References
8	0.54	31
Authors
5

Authors (5 rows)

Cited by (8 rows)

References (31 rows)

Name	Order	Citations	PageRank
Endre Boros	1	1779	155.63
Takashi Horiyama	2	81	19.76
Toshihide Ibaraki	3	2593	385.64
Kazuhisa Makino	4	1088	102.74
Mutsunori Yagiura	5	614	46.59

1