Abstract | ||
---|---|---|
Discriminative patterns are association patterns that occur with
disproportionate frequency in some classes versus others, and have been studied
under names such as emerging patterns and contrast sets. Such patterns have
demonstrated considerable value for classification and subgroup discovery, but
a detailed understanding of the types of interactions among items in a
discriminative pattern is lacking. To address this issue, we propose to
categorize discriminative patterns according to four types of item interaction:
(i) driver-passenger, (ii) coherent, (iii) independent additive and (iv)
synergistic beyond independent additive. Each of the last three is of
practical importance, with the latter two representing a gain in the
discriminative power of a pattern over its subsets. Synergistic patterns are
the most restrictive, but perhaps the most interesting, since they capture a
cooperative effect. For domains such as genetic research, differentiating among
these types of patterns is critical since each yields very different biological
interpretations. For general domains, the characterization provides a novel
view of the nature of the discriminative patterns in a dataset, which yields
insights beyond those provided by current approaches that focus mostly on
pattern-based classification and subgroup discovery. This paper presents a
comprehensive discussion that defines these four pattern types and investigates
their properties and their relationship to one another. In addition, these
ideas are explored for a variety of datasets (ten UCI datasets, one gene
expression dataset and two genetic-variation datasets). The results demonstrate
the existence, characteristics and statistical significance of the different
types of patterns. They also illustrate how pattern characterization can
provide novel insights into discriminative pattern mining and the
discriminative structure of different datasets. |
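
As a rough illustration of what "a gain in the discriminative power of a pattern over its subsets" could look like in practice, the sketch below assumes that discriminative power is measured as the difference in support between two classes. The abstract does not name a measure, so this choice, the toy data, and every function name here are hypothetical and are not the authors' definitions.

```python
from itertools import combinations


def support(transactions, pattern):
    """Fraction of transactions that contain every item in `pattern`."""
    if not transactions:
        return 0.0
    pattern = set(pattern)
    return sum(pattern <= set(t) for t in transactions) / len(transactions)


def diff_support(cases, controls, pattern):
    """Assumed measure of discriminative power: support in one class
    minus support in the other (hypothetical choice, see lead-in)."""
    return support(cases, pattern) - support(controls, pattern)


def gain_over_subsets(cases, controls, pattern):
    """Discriminative power of `pattern` minus the best power achieved
    by any proper, non-empty subset of it."""
    items = sorted(pattern)
    if len(items) < 2:
        raise ValueError("pattern needs at least two items to have subsets")
    best_subset = max(
        diff_support(cases, controls, sub)
        for k in range(1, len(items))
        for sub in combinations(items, k)
    )
    return diff_support(cases, controls, items) - best_subset


# Toy two-class data: each transaction is a set of items.
cases = [{"a", "b"}, {"a", "b"}, {"a"}, {"b"}]
controls = [{"a"}, {"b"}, {"a"}, {"b"}]

pair_power = diff_support(cases, controls, {"a", "b"})
pair_gain = gain_over_subsets(cases, controls, {"a", "b"})
```

In this toy data the pair `{a, b}` separates the classes better than either item alone, so its gain is positive. Deciding which of the four pattern types such a pattern belongs to would require the paper's own definitions, which the abstract does not give.
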
Year | Venue | Keywords |
---|---|---|
2011 | CoRR | statistical significance, genetics, genetic variation, gene expression, information theory |
Field | DocType | Volume |
Data mining, Categorization, Pattern recognition, Computer science, Discriminative pattern mining, Artificial intelligence, Discriminative model | Journal | abs/1102.4 |
Citations | PageRank | References |
3 | 0.40 | 23 |
Authors | ||
6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Gang Fang | 1 | 78 | 4.68 |
Wen Wang | 2 | 3 | 0.40 |
Benjamin Oatley | 3 | 3 | 0.40 |
Brian Van Ness | 4 | 3 | 0.40 |
Michael Steinbach | 5 | 1760 | 91.22 |
Vipin Kumar | 6 | 11560 | 934.35 |