Title
Estimation of Subspace Arrangements with Applications in Modeling and Segmenting Mixed Data.
Abstract
Recently many scientific and engineering applications have involved the challenging task of analyzing large amounts of unsorted high-dimensional data that have very complicated structures. From both geometric and statistical points of view, such unsorted data are considered mixed as different parts of the data have significantly different structures which cannot be described by a single model. In this paper we propose to use subspace arrangements—a union of multiple subspaces—for modeling mixed data: each subspace in the arrangement is used to model just a homogeneous subset of the data. Thus, multiple subspaces together can capture the heterogeneous structures within the data set. In this paper, we give a comprehensive introduction to a new approach for the estimation of subspace arrangements. This is known as generalized principal component analysis (GPCA). In particular, we provide a comprehensive summary of important algebraic properties and statistical facts that are crucial for making the inference of subspace arrangements both efficient and robust, even when the given data are corrupted by noise or contaminated with outliers. This new method in many ways improves and generalizes extant methods for modeling or clustering mixed data. There have been successful applications of this new method to many real-world problems in computer vision, image processing, and system identification. In this paper, we will examine several of those representative applications. This paper is intended to be expository in nature. However, in order that this may serve as a more complete reference for both theoreticians and practitioners, we take the liberty of filling in several gaps between the theory and the practice in the existing literature.
Year
2008
DOI
10.1137/060655523
Venue
SIAM Review
Keywords
subspace arrangement, multiple subspaces, comprehensive introduction, model selection, unsorted high-dimensional data, subspace arrangements, comprehensive summary, mixed data, unsorted data, new approach, outlier detection, minimum effective dimension, new method, generalized principal component analysis, Hilbert function
Field
Anomaly detection, Data mining, Mathematical optimization, Subspace topology, Inference, Model selection, Outlier, Algorithm, Linear subspace, System identification, Cluster analysis, Mathematics
DocType
Journal
Volume
50
Issue
3
ISSN
0036-1445
Citations
109
PageRank
4.71
References
17
Authors
4
Name
Order
Citations
PageRank
Yi Ma114931536.21
Allen Y. Yang25216183.98
Harm Derksen315115.00
R. M. Fossum41156.19