Title | ||
---|---|---|
SAMPLES: Self Adaptive Mining of Persistent LExical Snippets for Classifying Mobile Application Traffic |
Abstract | ||
---|---|---|
We present SAMPLES (Self Adaptive Mining of Persistent LExical Snippets), a systematic framework for classifying network traffic generated by mobile applications. SAMPLES constructs conjunctive rules automatically, using a supervised methodology over a set of labeled flows (the training set). Each conjunctive rule corresponds to the lexical context associated with an application identifier found in a snippet of the HTTP header, and is defined by: (a) the identifier type, (b) the HTTP header field it occurs in, and (c) the prefix/suffix surrounding its occurrence. These conjunctive rules then undergo an aggregate-and-validate step that improves accuracy and determines a priority order. The refined rule set is loaded into an application-identification engine, where it operates at per-flow granularity, in an extract-and-lookup paradigm, to identify the application responsible for a given flow. SAMPLES can thus facilitate important network measurement and management tasks, e.g. behavioral profiling [29] and application-level firewalls [21,22], which require a more detailed view of the underlying traffic than traditional protocol/port-based methods afford. We evaluate SAMPLES on a test set of approximately 15 million flows generated by over 700 K applications from the Android, iOS and Nokia marketplaces. SAMPLES successfully identifies over 90% of these applications with 99% accuracy on average, even though fewer than 2% of the applications are required during the training phase for each of the three marketplaces. This is a testament to the universality and scalability of our approach. We therefore expect SAMPLES to work with reasonable coverage and accuracy for other mobile platforms, e.g. BlackBerry and Windows Mobile, as well. |
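
The rule structure and the extract-and-lookup step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Rule` class, the `extract` and `identify` functions, the identifier-type label `"app_name"`, and the sample header values and application name are all hypothetical, chosen only to show how a rule defined by (identifier type, header field, prefix, suffix) might be applied to one flow, with rules tried in priority order.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    """A conjunctive rule (hypothetical representation)."""
    id_type: str  # kind of identifier, e.g. "app_name" (illustrative label)
    field: str    # HTTP header field the identifier occurs in
    prefix: str   # text immediately before the identifier ("" = field start)
    suffix: str   # text immediately after the identifier ("" = field end)


def extract(rule: Rule, headers: Dict[str, str]) -> Optional[str]:
    """Return the token between prefix and suffix in the rule's field, or None."""
    value = headers.get(rule.field)
    if value is None:
        return None
    start = value.find(rule.prefix)
    if start < 0:
        return None
    start += len(rule.prefix)
    # An empty suffix means the token runs to the end of the field value.
    end = value.find(rule.suffix, start) if rule.suffix else len(value)
    if end <= start:
        return None  # suffix missing, or the token would be empty
    return value[start:end]


def identify(
    headers: Dict[str, str],
    rules: List[Rule],
    lookup: Dict[Tuple[str, str], str],
) -> Optional[str]:
    """Try rules in priority order; return the first application found."""
    for rule in rules:
        token = extract(rule, headers)
        if token is not None:
            app = lookup.get((rule.id_type, token))
            if app is not None:
                return app
    return None


# Hypothetical rule set and lookup table, as if produced by training.
RULES = [Rule("app_name", "User-Agent", "", "/")]
LOOKUP = {("app_name", "WeatherNow"): "com.example.weathernow"}

flow_headers = {"User-Agent": "WeatherNow/2.1 CFNetwork/672.0", "Host": "api.example.com"}
print(identify(flow_headers, RULES, LOOKUP))
```

In this sketch the list order of `RULES` stands in for the priority order that the aggregate-and-validate step determines; how that order is actually computed is not specified in the abstract and is not modeled here.
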
Year | DOI | Venue |
---|---|---|
2015 | 10.1145/2789168.2790097 | ACM International Conference on Mobile Computing and Networking |
Keywords | Field | DocType |
Mobile App Identification, Automated Rule Generation | Data mining, Android (operating system), Computer science, Profiling (computer programming), Prefix, Lexical analysis, Header, Snippet, Scalability, Test set | Conference |
Citations | PageRank | References |
22 | 0.88 | 21 |
Authors | ||
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Hongyi Yao | 1 | 27 | 1.45 |
Gyan Ranjan | 2 | 35 | 2.26 |
Alok Tongaonkar | 3 | 241 | 14.88 |
Yong Liao | 4 | 249 | 21.07 |
Zhuoqing Morley Mao | 5 | 203 | 10.97 |