Title
Integrative Generalized Convex Clustering Optimization And Feature Selection For Mixed Multi-View Data
Abstract
In mixed multi-view data, multiple sets of diverse features are measured on the same set of samples. By integrating all available data sources, we seek to discover common group structure among the samples that may be hidden in individualistic cluster analyses of a single data view. While several techniques for such integrative clustering have been explored, we propose and develop a convex formalization that enjoys strong empirical performance and inherits the mathematical properties of increasingly popular convex clustering methods. Specifically, our Integrative Generalized Convex Clustering Optimization (iGecco) method employs different convex distances, losses, or divergences for each of the different data views with a joint convex fusion penalty that leads to common groups. Additionally, integrating mixed multi-view data is often challenging when each data source is high-dimensional. To perform feature selection in such scenarios, we develop an adaptive shifted group-lasso penalty that selects features by shrinking them towards their loss-specific centers. Our so-called iGecco+ approach selects features from each data view that are best for determining the groups, often leading to improved integrative clustering. To solve our problem, we develop a new type of generalized multi-block ADMM algorithm using subproblem approximations that more efficiently fits our model for big data sets. Through a series of numerical experiments and real data examples on text mining and genomics, we show that iGecco+ achieves superior empirical performance for high-dimensional mixed multi-view data.
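As a rough illustration of the structure the abstract describes (a separate convex loss per data view, tied together by one joint fusion penalty over pairs of samples), the following minimal sketch evaluates an iGecco-style objective for given centroids. It is not the authors' implementation: the function names, the `gamma` tuning parameter, the optional pairwise `weights`, and the choice of squared-error and absolute-error losses are all assumptions made for illustration, and the feature-selection penalty of iGecco+ and the ADMM solver are omitted.

```python
import math

def squared_error(x, u):
    """Squared Euclidean loss between a sample row and its centroid row."""
    return sum((a - b) ** 2 for a, b in zip(x, u))

def absolute_error(x, u):
    """Manhattan (L1) loss, another convex choice for a data view."""
    return sum(abs(a - b) for a, b in zip(x, u))

def igecco_style_objective(views, centroids, losses, gamma, weights=None):
    """Sum of per-view losses plus a joint fusion penalty (illustrative only).

    views, centroids: lists of n-row matrices (lists of lists), one per view.
    losses: one loss function per view (hypothetical choices above).
    gamma: assumed tuning parameter controlling the amount of fusion.
    weights: optional n-by-n pairwise weights; defaults to 1 for every pair.
    """
    n = len(views[0])
    # Data-fit term: each view is scored with its own convex loss.
    fit = 0.0
    for X, U, loss in zip(views, centroids, losses):
        fit += sum(loss(X[i], U[i]) for i in range(n))
    # Joint fusion term: one L2 norm per sample pair, taken across ALL views,
    # so two samples can only fuse if their centroids agree in every view.
    fusion = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            w = 1.0 if weights is None else weights[i][j]
            sq = sum(squared_error(U[i], U[j]) for U in centroids)
            fusion += w * math.sqrt(sq)
    return fit + gamma * fusion

# Two toy views measured on the same three samples.
view_a = [[0.0], [0.0], [3.0]]
view_b = [[1.0, 1.0], [1.0, 1.0], [5.0, 1.0]]
losses = [squared_error, absolute_error]

# Centroids equal to the data: zero loss, nonzero fusion penalty.
print(igecco_style_objective([view_a, view_b], [view_a, view_b], losses, gamma=0.5))
```

With the centroids set equal to the data, only the fusion term contributes. Forcing all rows of each centroid matrix to be identical removes the fusion term and leaves only the per-view losses, which is the trade-off `gamma` would control in this sketch.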
Year
2021
DOI
v22/19-1012.html
Venue
JOURNAL OF MACHINE LEARNING RESEARCH
Keywords
Integrative clustering, convex clustering, feature selection, convex optimization, sparse clustering, GLM deviance, Bregman divergences
DocType
Journal
Volume
22
Issue
55
ISSN
1532-4435
Citations
0
PageRank
0.34
References
0
Authors
2
Name               Order  Citations  PageRank
Minjie Wang        1      315        13.62
Genevera I. Allen  2      89         11.18