Title | ||
---|---|---|
Approximate matching of persistent LExicon using search-engines for classifying Mobile app traffic |
Abstract | ||
---|---|---|
We present AMPLES, Approximate Matching of Persistent LExicon using Search-Engines, to address the Mobile-Application-Identification (MApId) problem in network traffic at a per-flow granularity. We transform MApId into an information-retrieval problem where lexical similarity of short-text-documents is used as a metric for classification tasks. Specifically, a network-flow, observed at an intercept-point, is treated as a semi-structured-text-document and modified into a flow-query. This query is then run against a corpus of documents pre-indexed in a search-engine. Each index-document represents an application, and consists of distinguishable identifiers from the metadata-file and URL-strings found in the application's executable-archive. The search-engine acts as a kernel function, generating a score distribution vis-à-vis the index-documents, to determine a match. This extends the scope of MApId to fuzzy-classification mapping a flow to a family of apps when the score distribution is spread-out. Through experiments over an emulator-generated test-dataset (400 K applications and 13.5 million flows), we obtain over 80% flow coverage and about 85% application coverage with low false-positives (4%) and nearly no false-negatives. We also validate our methodology over a real network trace. Most importantly, our methodology is platform agnostic, and subsumes previous studies, most of which focus solely on the application coverage. |
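
The retrieval idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain TF-IDF cosine score in place of a real search engine's ranking function, and the function names, the `margin` parameter, the app identifiers, and the sample flow are all hypothetical. Each index-document is a bag of tokens for one app, a flow is tokenized into a flow-query, and the resulting score distribution yields either a single app or a family of apps when the scores are close.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split a URL-like string into lowercase alphanumeric tokens."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def build_index(app_docs):
    """Return per-app token counts and smoothed inverse document frequencies."""
    counts = {app: Counter(tokenize(doc)) for app, doc in app_docs.items()}
    n_docs = len(counts)
    df = Counter()
    for c in counts.values():
        df.update(c.keys())
    idf = {tok: math.log((1 + n_docs) / (1 + d)) + 1.0 for tok, d in df.items()}
    return counts, idf


def score_flow(flow_text, counts, idf):
    """Score a flow-query against every index-document (cosine over TF-IDF)."""
    query = Counter(tokenize(flow_text))
    # Tokens never seen in any index-document carry no weight.
    q_vec = {t: n * idf[t] for t, n in query.items() if t in idf}
    q_norm = math.sqrt(sum(v * v for v in q_vec.values()))
    scores = {}
    for app, c in counts.items():
        d_vec = {t: n * idf[t] for t, n in c.items()}
        d_norm = math.sqrt(sum(v * v for v in d_vec.values()))
        dot = sum(w * d_vec.get(t, 0.0) for t, w in q_vec.items())
        scores[app] = dot / (q_norm * d_norm) if q_norm and d_norm else 0.0
    return scores


def classify(scores, margin=1.5):
    """Return every app whose score is within a factor `margin` of the best.

    One result is a crisp match; several results form a family of apps;
    an empty list means no index-document matched at all.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked or ranked[0][1] == 0.0:
        return []
    top = ranked[0][1]
    return [app for app, s in ranked if s * margin >= top]


# Hypothetical index-documents: package name plus URL strings from the archive.
APP_DOCS = {
    "com.example.weather": "com.example.weather api.weather.example.com forecast",
    "com.example.news": "com.example.news feeds.news.example.com headlines",
}

counts, idf = build_index(APP_DOCS)
flow = "GET /forecast?city=paris Host: api.weather.example.com"
print(classify(score_flow(flow, counts, idf)))
```

A flow that only carries tokens shared by several apps (for example a common CDN host) produces a flat score distribution, so `classify` returns more than one app, which mirrors the fuzzy-classification case described above.
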
Year | DOI | Venue |
---|---|---|
2016 | 10.1109/INFOCOM.2016.7524386 | IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications |
Keywords | Field | DocType |
approximate matching of persistent lexicon using search-engines,AMPLES,mobile-application-identification,MApId classification,information retrieval,network flow,semistructured text document,flow query,metadata file,URL string,fuzzy classification mapping | Lexical similarity,Data mining,Mobile app,Search engine,Identifier,Computer science,Lexicon,Approximate matching,Granularity,Kernel (statistics) | Conference |
ISSN | ISBN | Citations |
0743-166X | 978-1-4673-9954-8 | 6 |
PageRank | References | Authors |
0.50 | 12 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Gyan Ranjan | 1 | 35 | 2.26 |
Alok Tongaonkar | 2 | 241 | 14.88 |
Ruben Torres | 3 | 36 | 3.07 |