Approximate matching of persistent LExicon using search-engines for classifying Mobile app traffic - Citegraph

Paper Info

Title
Approximate matching of persistent LExicon using search-engines for classifying Mobile app traffic

Abstract
We present AMPLES, Approximate Matching of Persistent LExicon using Search-Engines, to address the Mobile-Application-Identification (MApId) problem in network traffic at a per-flow granularity. We transform MApId into an information-retrieval problem where lexical similarity of short-text-documents is used as a metric for classification tasks. Specifically, a network-flow, observed at an intercept-point, is treated as a semi-structured-text-document and modified into a flow-query. This query is then run against a corpus of documents pre-indexed in a search-engine. Each index-document represents an application, and consists of distinguishable identifiers from the metadata-file and URL-strings found in the application's executable-archive. The search-engine acts as a kernel function, generating a score distribution vis-à-vis the index-documents, to determine a match. This extends the scope of MApId to fuzzy-classification mapping a flow to a family of apps when the score distribution is spread-out. Through experiments over an emulator-generated test-dataset (400 K applications and 13.5 million flows), we obtain over 80% flow coverage and about 85% application coverage with low false-positives (4%) and nearly no false-negatives. We also validate our methodology over a real network trace. Most importantly, our methodology is platform agnostic, and subsumes previous studies, most of which focus solely on the application coverage.
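
The pipeline described in the abstract (flow to query, query against per-app index-documents, score distribution to a single-app or family decision) can be illustrated with a minimal sketch. This is not the authors' implementation: a small TF-IDF cosine scorer stands in for the search-engine, and the app names, the index-document contents, and the `dominance` and `family_floor` thresholds are all hypothetical values chosen for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lower-case the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if len(t) > 1]

def build_index(app_docs):
    """Turn {app_name: identifier text} into per-app TF-IDF vectors plus an IDF table."""
    counts = {app: Counter(tokenize(text)) for app, text in app_docs.items()}
    n_docs = len(counts)
    doc_freq = Counter(t for c in counts.values() for t in c)
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in doc_freq.items()}
    vectors = {app: {t: k * idf[t] for t, k in c.items()} for app, c in counts.items()}
    return vectors, idf

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def classify(flow_text, vectors, idf, dominance=0.5, family_floor=0.25):
    """Score a flow-query against every index-document and read the score distribution.

    dominance and family_floor are hypothetical thresholds, not values from the paper.
    """
    query_counts = Counter(tokenize(flow_text))
    # Tokens unknown to the index carry no weight in the query.
    query = {t: k * idf[t] for t, k in query_counts.items() if t in idf}
    scores = {app: cosine(query, vec) for app, vec in vectors.items()}
    total = sum(scores.values())
    if total == 0:
        return ("no-match", [])
    shares = sorted(((s / total, app) for app, s in scores.items()), reverse=True)
    top_share, top_app = shares[0]
    if top_share >= dominance:
        # Peaked distribution: attribute the flow to one app.
        return ("app", [top_app])
    # Spread-out distribution: report the family of apps sharing most of the score.
    return ("family", sorted(app for share, app in shares if share >= family_floor))

# Hypothetical index-documents: package identifiers and URL-strings per app.
APP_DOCS = {
    "com.example.weather": "com.example.weather api.weather.example.com forecast",
    "com.example.news": "com.example.news cdn.news.example.com headlines",
    "com.example.newslite": "com.example.newslite cdn.news.example.com headlines",
}
VECTORS, IDF = build_index(APP_DOCS)

if __name__ == "__main__":
    for flow in ("GET api.weather.example.com /forecast",
                 "GET cdn.news.example.com /headlines",
                 "GET tracker.unknown.org /ping"):
        print(flow, "->", classify(flow, VECTORS, IDF))
```

With these toy documents, a flow to the weather host resolves to a single app, while a flow to the shared news CDN spreads its score over the two news apps and is reported as a family, which mirrors the fuzzy-classification case the abstract mentions.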

Year	DOI	Venue
2016	10.1109/INFOCOM.2016.7524386	IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications
Keywords	Field	DocType
approximate matching of persistent lexicon using search-engines,AMPLES,mobile-application-identification,MApId classification,information retrieval,network flow,semistructured text document,flow query,metadata file,URL string,fuzzy classification mapping	Lexical similarity,Data mining,Mobile app,Search engine,Identifier,Computer science,Lexicon,Approximate matching,Granularity,Kernel (statistics)	Conference
ISSN	ISBN	Citations
0743-166X	978-1-4673-9954-8	6
PageRank	References	Authors
0.50	12	3

Authors (3 rows)

Name	Order	Citations	PageRank
Gyan Ranjan	1	35	2.26
Alok Tongaonkar	2	241	14.88
Ruben Torres	3	36	3.07
