Title
Peer-to-Peer Traffic Identification by Mining IP Layer Data Streams Using Concept-Adapting Very Fast Decision Tree
Abstract
We apply streaming data mining techniques, in particular the concept-adapting very fast decision tree (CVFDT), to identify peer-to-peer (P2P) applications in Internet traffic. Two properties of the problem motivate this choice. First, Internet data flows dynamically and in large volumes, so it must be treated as streaming data. Second, in P2P applications new communities of peers frequently join and old communities frequently leave, so an identification method must cope with concept drift and update its model incrementally. We captured Internet traffic at a main gateway router, pre-processed the captured data, selected the most significant attributes, and prepared a training data stream to which the CVFDT model was applied. We tested our approach on a data stream of 3.5 million P2P and NonP2P traffic records. The results show that our approach deals effectively with the dynamic nature of streaming data and detects changes in communities of peers. The classification accuracy is higher than 95%, and the method scales well in both time and space complexity, making it suitable for large-scale dynamic data. We extracted attributes only from the IP layer, which eliminates the privacy concern associated with techniques that use deep packet inspection.
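The abstract does not say how the tree decides when to grow. As background, learners in the very fast decision tree family, which CVFDT extends, typically split a leaf once a Hoeffding bound indicates that the best attribute beats the runner-up with high confidence. The sketch below illustrates that test only. The function names, the `delta` and `tie_threshold` values, and the example gains are assumptions made for illustration, not values reported for this work. It also leaves out CVFDT's sliding window and alternate subtrees, which are what handle concept drift.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability 1 - delta, the true mean of a
    variable with the given range lies within epsilon of the mean of n samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float, n: int,
                 delta: float = 1e-6, n_classes: int = 2,
                 tie_threshold: float = 0.05) -> bool:
    """Decide whether a leaf that has seen n records should split.

    delta and tie_threshold are illustrative defaults (assumptions).
    """
    # Information gain for a c-class problem lies in [0, log2(c)].
    value_range = math.log2(n_classes)
    eps = hoeffding_bound(value_range, delta, n)
    # Split when the best attribute is confidently better than the runner-up,
    # or when the two are so close that waiting longer is pointless (a tie).
    return (best_gain - second_gain) > eps or eps < tie_threshold


if __name__ == "__main__":
    # Hypothetical gains for two candidate attributes at a P2P/NonP2P leaf.
    print(should_split(best_gain=0.30, second_gain=0.10, n=1000))
    print(should_split(best_gain=0.30, second_gain=0.28, n=200))
```

Because the bound shrinks as more records arrive, a leaf that cannot yet separate two attributes simply waits for more of the stream. This lets the tree grow from a single pass over the data without storing the records.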
Year
2008
DOI
10.1109/ICTAI.2008.12
Venue
Tools with Artificial Intelligence, 2008. ICTAI '08. 20th IEEE International Conference
Keywords
IP networks, Internet, data mining, decision trees, internetworking, peer-to-peer computing, telecommunication network routing, telecommunication traffic, IP layer data stream mining, Internet data flows, Internet traffic, concept-adapting very fast decision tree, deep packet inspection, gateway router, peer-to-peer traffic identification, CVFDT, Concept Drift, IP Traffic Identification, Peer-to-Peer Traffic, Stream Data Mining
Field
Deep packet inspection, Data stream mining, Data stream, Computer science, Computer network, Concept drift, Internetworking, Dynamic data, Internet traffic, The Internet
DocType
Conference
Volume
1
ISSN
1082-3409
ISBN
978-0-7695-3440-4
Citations
9
PageRank
0.52
References
12
Authors
3
Name | Order | Citations | PageRank
Bijan Raahemi | 1 | 155 | 22.29
Weicai Zhong | 2 | 381 | 26.14
Jing Liu | 3 | 1043 | 115.54