Title
Privacy-preserving categorization of mobile applications based on large-scale usage data
Abstract
Categorizing mobile applications (apps) by functionality is essential for app stores to manage a huge number of apps efficiently and securely. Existing methods have a weakness: apps are uploaded from untrusted sources, and the static features extracted for categorization can easily be masked by obfuscation or encryption. To solve this problem and improve categorization accuracy, we propose extracting features from the usage data generated by apps running on mobile devices. Usage data, such as the average running time or the number of active users of an app, is hard for untrusted developers to manipulate, and different types of apps generate different usage patterns. Based on this observation, we propose a new privacy-preserving method for categorizing mobile apps that learns patterns from large-scale usage data. First, the usage data collected from different users is anonymized by shuffling. Then we formalize the usage data as time series, and extract and cluster the usage data for each app based on Dynamic Time Warping. We use Shape Features to segment the clustered time series and transform them into feature vectors. Finally, we adopt five machine learning methods to train and test categorization models on 3,086 apps. The results show that SVM performs best. When we exclude apps with fewer than 50,000 usage data flows, the categorization performance (F1-score) of our method improves to over 96%, which is significantly better than previous methods.
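The abstract names Dynamic Time Warping (DTW) as the basis for clustering usage time series but does not give the formulation used. As a rough illustration only, the textbook DTW distance can be sketched as below. The function name `dtw_distance`, the absolute-difference local cost, and the sample series are assumptions made for this sketch, not details taken from the paper.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two numeric sequences.

    Assumption: the local cost is the absolute difference between points.
    Runs in O(len(a) * len(b)) time and memory.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] matched again
                                 cost[i][j - 1],      # b[j-1] matched again
                                 cost[i - 1][j - 1])  # both sequences advance
    return cost[n][m]


# Hypothetical hourly usage counts for two apps; the second is the first
# delayed by one time step, which DTW can absorb by warping.
usage_a = [0, 1, 3, 1, 0]
usage_b = [0, 0, 1, 3, 1, 0]
print(dtw_distance(usage_a, usage_b))
```

Because DTW allows one point to align with several neighbors, series with the same shape but different timing end up close together, which is presumably why it suits grouping usage patterns before feature extraction.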
Year: 2020
DOI: 10.1016/j.ins.2019.11.007
Venue: Information Sciences
Keywords: Dynamic usage data, Time series, Privacy, Untrusted apps
Field: Data mining, Categorization, Feature vector, Dynamic time warping, Upload, Encryption, Mobile device, Artificial intelligence, Usage data, Obfuscation, Machine learning, Mathematics
DocType: Journal
Volume: 514
ISSN: 0020-0255
Citations: 1
PageRank: 0.34
References: 0
Authors: 6
Name            Order   Citations   PageRank
Yongzhong He    1       1           0.34
Chao Wang       2       1           0.34
Guangquan Xu    3       171         33.20
Wenjuan Lian    4       9           1.77
Hequn Xian      5       12          4.87
Wei Wang        6       7122        746.33