Name: SHAM KAKADE
Affiliation: Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104, USA
Papers: 182
Collaborators: 204
Citations: 4365
PageRank: 282.77
Referers: 6354
Referees: 1462
References: 1395
Title | Citations | PageRank | Year
Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression | 0 | 0.34 | 2022
Multi-Stage Episodic Control for Strategic Exploration in Text Games | 0 | 0.34 | 2022
Understanding Contrastive Learning Requires Incorporating Inductive Biases | 0 | 0.34 | 2022
Benign Overfitting of Constant-Stepsize SGD for Linear Regression | 0 | 0.34 | 2021
Bilinear Classes: A Structural Framework For Provable Generalization In RL | 0 | 0.34 | 2021
Optimal Regularization Can Mitigate Double Descent | 0 | 0.34 | 2021
The Benefits of Implicit Regularization from SGD in Least Squares Problems | 0 | 0.34 | 2021
How Important Is The Train-Validation Split In Meta-Learning? | 0 | 0.34 | 2021
Is a Good Representation Sufficient for Sample Efficient Reinforcement Learning | 0 | 0.34 | 2020
Information Theoretic Regret Bounds for Online Nonlinear Control | 0 | 0.34 | 2020
Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity | 0 | 0.34 | 2020
Soft Threshold Weight Reparameterization for Learnable Sparsity | 0 | 0.34 | 2020
Sample-Efficient Reinforcement Learning of Undercomplete POMDPs | 0 | 0.34 | 2020
The Implicit and Explicit Regularization Effects of Dropout | 0 | 0.34 | 2020
PACT: Privacy-Sensitive Protocols And Mechanisms for Mobile Contact Tracing | 0 | 0.34 | 2020
Model-based reinforcement learning with a generative model is minimax optimal | 0 | 0.34 | 2020
The Nonstochastic Control Problem | 0 | 0.34 | 2020
Meta-Learning with Implicit Gradients | 4 | 0.39 | 2019
The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure | 0 | 0.34 | 2019
Online Meta-Learning | 0 | 0.34 | 2019
A Short Note on Concentration Inequalities for Random Vectors with SubGaussian Norm | 1 | 0.36 | 2019
The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares | 0 | 0.34 | 2019
Maximum Likelihood Estimation for Learning Populations of Parameters | 0 | 0.34 | 2019
Stochastic subgradient method converges on tame functions | 10 | 0.59 | 2018
Global Convergence of Policy Gradient Methods for Linearized Control Problems | 7 | 0.53 | 2018
Provably Efficient Maximum Entropy Exploration | 1 | 0.35 | 2018
Plan Online, Learn Offline: Efficient Learning and Exploration via Model-Based Control | 4 | 0.39 | 2018
Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines | 12 | 0.55 | 2018
Variance Reduction Methods for Sublinear Reinforcement Learning | 3 | 0.40 | 2018
On the Insufficiency of Existing Momentum Schemes for Stochastic Optimization | 0 | 0.34 | 2018
Recovering Structured Probability Matrices | 0 | 0.34 | 2018
On the Insufficiency of Existing Momentum Schemes for Stochastic Optimization | 5 | 0.43 | 2018
Coupled Recurrent Models for Polyphonic Music Composition | 0 | 0.34 | 2018
A Markov Chain Theory Approach to Characterizing the Minimax Optimality of Stochastic Gradient Descent (for Least Squares) | 3 | 0.42 | 2017
Accelerating Stochastic Gradient Descent | 0 | 0.34 | 2017
Minimal Realization Problems for Hidden Markov Models | 0 | 0.34 | 2016
Parallelizing Stochastic Approximation Through Mini-Batching and Tail-Averaging | 9 | 0.55 | 2016
Efficient Algorithms for Large-scale Generalized Eigenvector Computation and Canonical Correlation Analysis | 8 | 0.48 | 2016
Robust Shift-and-Invert Preconditioning: Faster and More Sample Efficient Algorithms for Eigenvector Computation | 11 | 0.64 | 2015
Super-Resolution Off the Grid | 1 | 0.35 | 2015
Computing Matrix Squareroot via Non Convex Local Search | 3 | 0.38 | 2015
A Linear Dynamical System Model for Text | 7 | 0.47 | 2015
Un-regularizing: approximate proximal point and faster stochastic algorithms for empirical risk minimization | 37 | 1.25 | 2015
Convergence Rates of Active Learning for Maximum Likelihood Estimation | 8 | 0.53 | 2015
Competing with the Empirical Risk Minimizer in a Single Pass | 28 | 1.71 | 2015
A tensor approach to learning mixed membership community models | 41 | 1.23 | 2014
Random Design Analysis of Ridge Regression | 33 | 1.54 | 2014
Optimal Dynamic Mechanism Design and the Virtual-Pivot Mechanism | 27 | 2.08 | 2013
Least Squares Revisited: Scalable Approaches for Multi-class Prediction | 11 | 0.62 | 2013
When are Overcomplete Topic Models Identifiable? Uniqueness of Tensor Tucker Decompositions with Structured Sparsity | 3 | 0.40 | 2013