Title
Improving Cloud Storage Search with User Activity
Abstract
Cloud-based file storage platforms such as Google Drive are widely used as a means for storing, editing and sharing personal and organizational documents. In this paper, we improve search ranking quality for cloud storage platforms by utilizing user activity logs. Different from search logs, activity logs capture general document usage activity beyond search, such as opening, editing and sharing documents. We propose to automatically learn text embeddings that are effective for search ranking from activity logs. We develop a novel co-access signal, i.e., whether two documents were accessed by a user around the same time, to train deep semantic matching models that are useful for improving the search ranking quality. We confirm that activity-trained semantic matching models can improve ranking by conducting extensive offline experimentation using Google Drive search and activity logs. To the best of our knowledge, this is the first work to examine the benefits of leveraging document usage activity at large scale for cloud storage search; as such it can shed light on using such activity in scenarios where direct collection of search-specific interactions (e.g., query and click logs) may be expensive or infeasible.
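The co-access signal described in the abstract (two documents accessed by the same user around the same time) can be illustrated with a minimal sketch. This is not the authors' implementation: the log schema `(user_id, doc_id, timestamp_seconds)`, the function name `co_access_pairs`, and the one-hour `window_seconds` default are all assumptions made for illustration, since the abstract does not specify how "around the same time" is defined.

```python
from collections import defaultdict


def co_access_pairs(events, window_seconds=3600):
    """Return unordered document pairs co-accessed by the same user.

    `events` is an iterable of (user_id, doc_id, timestamp_seconds) tuples.
    Both this schema and the default window are hypothetical choices.
    """
    # Group accesses per user so that pairs never span different users.
    by_user = defaultdict(list)
    for user_id, doc_id, ts in events:
        by_user[user_id].append((ts, doc_id))

    pairs = set()
    for accesses in by_user.values():
        accesses.sort()  # chronological order within one user
        for i, (ts_i, doc_i) in enumerate(accesses):
            for ts_j, doc_j in accesses[i + 1:]:
                if ts_j - ts_i > window_seconds:
                    break  # later accesses are even further away in time
                if doc_i != doc_j:
                    # Store each pair in a canonical (sorted) order.
                    pairs.add(tuple(sorted((doc_i, doc_j))))
    return pairs


# Hypothetical activity log: two users, three documents.
events = [
    ("u1", "a", 0),
    ("u1", "b", 100),
    ("u1", "c", 10000),
    ("u2", "a", 0),
    ("u2", "c", 50),
]
pairs = co_access_pairs(events)
```

Pairs produced this way could serve as positive training examples for a semantic matching model, with negatives sampled from documents that were not co-accessed; how the paper actually constructs its training data is not stated in the abstract.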
Year
2021
DOI
10.1145/3437963.3441780
Venue
WSDM
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
6
Name                    Order   Citations   PageRank
Rolf Jagerman           1       1           0.69
Weize Kong              2       78          4.87
Rama Kumar Pasumarthi   3       16          2.25
Zhen Qin                4       138         16.93
Michael Bendersky       5       986         48.69
Marc A. Najork          6       2538        278.16