Title
Disk-based Matrix Completion for Memory Limited Devices.
Abstract
A growing amount of data must be processed or analyzed on mobile devices for efficiency or privacy reasons, but running machine learning tasks over large data on such devices is challenging because of their limited memory. For this reason, disk-based machine learning methods, which use storage resources without holding all the data in memory, have been actively researched. This paper proposes D-MC2, a novel disk-based matrix completion method that (1) supports incremental data updates (i.e., data insertion and deletion) and (2) spills both data and model to disk when necessary; existing methods do not support these functionalities. First, D-MC2 builds a two-layered index to efficiently support incremental data updates; there is a trade-off between model learning cost and data update cost, and the two-layered index optimizes the two costs simultaneously. Second, we develop a window-based stochastic gradient descent (SGD) scheduler to efficiently support the dual spilling; a huge amount of disk I/O is incurred when the model is larger than memory, and the new scheduler substantially reduces it. Our evaluation results show that D-MC2 is significantly more scalable and faster than other disk-based competitors in a memory-limited environment. In terms of co-optimization, D-MC2 outperforms baselines that optimize only one of the two costs by up to 48x. Furthermore, the window-based scheduler makes training 12.4x faster than a naive scheduler.
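As a rough illustration of the general setting the abstract describes (SGD-based matrix completion where the observed entries live on disk and are read in small pieces), here is a minimal Python sketch. It is not D-MC2: it has no two-layered index, no window-based scheduler, and it keeps the model in memory. All function names, the text file layout, and the hyperparameters are assumptions made for this example only.

```python
import random


def write_ratings(path, ratings):
    """Store observed entries on disk as 'row col value' lines (assumed layout)."""
    with open(path, "w") as f:
        for row, col, value in ratings:
            f.write(f"{row} {col} {value}\n")


def read_chunks(path, chunk_size):
    """Yield lists of at most chunk_size entries, so only one chunk is in memory."""
    chunk = []
    with open(path) as f:
        for line in f:
            row, col, value = line.split()
            chunk.append((int(row), int(col), float(value)))
            if len(chunk) == chunk_size:
                yield chunk
                chunk = []
    if chunk:
        yield chunk


def predict(P, Q, row, col):
    """Predicted entry: dot product of the row factor and the column factor."""
    return sum(p * q for p, q in zip(P[row], Q[col]))


def rmse(path, P, Q, chunk_size=4):
    """Root mean squared error over the entries stored on disk."""
    total, count = 0.0, 0
    for chunk in read_chunks(path, chunk_size):
        for row, col, value in chunk:
            total += (value - predict(P, Q, row, col)) ** 2
            count += 1
    return (total / count) ** 0.5


def init_factors(n_rows, n_cols, rank, seed=0):
    """Small random factor matrices P (rows) and Q (columns)."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_rows)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_cols)]
    return P, Q


def train(path, P, Q, epochs=300, lr=0.05, reg=0.01, chunk_size=4, seed=0):
    """Plain SGD with L2 regularization, streaming chunks from disk each epoch."""
    rng = random.Random(seed)
    for _ in range(epochs):
        for chunk in read_chunks(path, chunk_size):
            rng.shuffle(chunk)  # shuffle only within the in-memory chunk
            for row, col, value in chunk:
                err = value - predict(P, Q, row, col)
                for k in range(len(P[row])):
                    p, q = P[row][k], Q[col][k]
                    P[row][k] += lr * (err * q - reg * p)
                    Q[col][k] += lr * (err * p - reg * q)
    return P, Q
```

In this sketch only the data is streamed from disk; the case the abstract highlights, where the model itself does not fit in memory, would additionally require deciding which factor rows to load and evict, which is the cost the paper's scheduler targets.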
Year
2018
DOI
10.1145/3269206.3271685
Venue
CIKM
Keywords
Matrix completion, Stochastic gradient descent, Data management
Field
Data mining, Stochastic gradient descent, Existential quantification, Matrix completion, Computer science, Mobile device, Data management, Model learning, Distributed computing, Scalability
DocType
Conference
ISBN
978-1-4503-6014-2
Citations
0
PageRank
0.34
References
19
Authors
5
Name, Order, Citations and PageRank
Dongha Lee, 1, 146.77
Jinoh Oh, 2, 30315.32
Christos Faloutsos, 3, 279724490.38
Byungju Kim, 4, 91.91
Hwanjo Yu, 5, 1715114.02