Title
LowCon: A Design-Based Subsampling Approach in a Misspecified Linear Model
Abstract
We consider a measurement-constrained supervised learning problem in which (i) the full sample of predictors is given, and (ii) the response observations are unavailable and expensive to measure. It is therefore desirable to select a subsample of predictor observations, measure the corresponding responses, and then fit the supervised learning model on the subsample of predictors and responses. However, model fitting is a trial-and-error process, and a postulated model for the data could be misspecified. Our empirical studies demonstrate that most existing subsampling methods perform unsatisfactorily when the model is misspecified. In this paper, we develop a novel subsampling method, called "LowCon," which outperforms the competing methods when the working linear model is misspecified. Our method uses orthogonal Latin hypercube designs to achieve robust estimation. We show that the proposed design-based estimator approximately minimizes the so-called worst-case bias with respect to many possible misspecification terms. Both simulated and real-data analyses demonstrate that the proposed estimator is more robust than several subsample least-squares estimators obtained by state-of-the-art subsampling methods. Supplementary materials for this article are available online.
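The general idea of design-based subsampling described in the abstract can be illustrated with a rough sketch. This is not the authors' LowCon algorithm: it uses a plain random Latin hypercube rather than an orthogonal Latin hypercube design, and the function names `latin_hypercube` and `design_based_subsample` are hypothetical. It assumes the predictors are rescaled to the unit cube and that each design point is matched to its nearest unused data point; the responses would then be measured only for the selected rows.

```python
import random


def latin_hypercube(n, d, rng):
    """Return n points in [0, 1)^d with exactly one point per stratum
    [k/n, (k+1)/n) in every dimension (a plain, non-orthogonal LHD)."""
    columns = []
    for _ in range(d):
        strata = list(range(n))
        rng.shuffle(strata)
        columns.append([(k + rng.random()) / n for k in strata])
    return [tuple(col[i] for col in columns) for i in range(n)]


def design_based_subsample(X, n, seed=0):
    """Select n distinct row indices of X, each the nearest unused row
    (in rescaled Euclidean distance) to one Latin hypercube design point."""
    if n > len(X):
        raise ValueError("subsample size exceeds the number of rows")
    rng = random.Random(seed)
    d = len(X[0])
    lo = [min(row[j] for row in X) for j in range(d)]
    hi = [max(row[j] for row in X) for j in range(d)]
    span = [(h - l) or 1.0 for l, h in zip(lo, hi)]  # guard constant columns
    scaled = [[(row[j] - lo[j]) / span[j] for j in range(d)] for row in X]

    available = set(range(len(X)))
    chosen = []
    for point in latin_hypercube(n, d, rng):
        nearest = min(
            available,
            key=lambda i: sum((scaled[i][j] - point[j]) ** 2 for j in range(d)),
        )
        chosen.append(nearest)
        available.remove(nearest)  # select without replacement
    return chosen


# Illustrative usage on synthetic predictors (made-up data).
data_rng = random.Random(1)
X = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
subsample_indices = design_based_subsample(X, 20)
```

A least-squares fit would then use only the rows in `subsample_indices` and their measured responses. The selection above spreads the subsample over the predictor space rather than concentrating it in high-leverage regions.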
Year
2021
DOI
10.1080/10618600.2020.1844215
Venue
Journal of Computational and Graphical Statistics
Keywords
Condition number, Experimental design, Least-squares estimation, Worst-case MSE
DocType
Journal
Volume
30
Issue
3
ISSN
1061-8600
Citations
0
PageRank
0.34
References
0
Authors
6
Name              Order  Citations  PageRank
Meng, Cheng       1      0          1.01
Rui Xie           2      1          2.05
Abhyuday Mandal   3      0          0.68
Xinlian Zhang     4      0          0.34
Wenxuan Zhong     5      21         3.46
Ma, Ping          6      2          2.73