Title
Hadoop based Deep Packet Inspection system for traffic analysis of e-business websites
Abstract
Internet traffic is growing explosively, and online shopping is one of its significant drivers. Network operators, unwilling to remain dumb pipes, are making every effort to mine this mass traffic with Deep Packet Inspection (DPI), which is a major challenge for massive data when traditional methods and programming models are used. Hadoop provides an alternative approach with its strengths in distributed storage and parallel computing. This paper reports a Hadoop-based DPI system integrated with a web crawler. The system architecture and the MapReduce models for packet analysis and web URL restoration are presented. As an example, live web traffic to Tmall, the leading e-shopping giant in China, is investigated using this system. The popularity of products, categories and brands over a given period is evaluated from product page views. Detailed product information is supplied by the product information base built by the web crawler. This work explores the methodology of using Hadoop for DPI and offers guidelines for developing such a system, which network operators can further use to analyze other services and mine the value of network traffic.
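The page-view counting described in the abstract can be illustrated with a minimal map/reduce-style sketch in plain Python. This is not the authors' implementation: the host names, the `/item.htm` path, the `id` query parameter, and the function names `map_request` and `reduce_counts` are all assumptions made for illustration, and a real deployment would run the two steps as Hadoop MapReduce jobs over restored URLs rather than over an in-memory list.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs


def map_request(host, path):
    """Map step: emit (product_id, 1) for a product-page URL, else nothing."""
    url = urlparse("http://" + host + path)
    # Assumed layout: product pages live at /item.htm and carry the
    # product ID in an `id` query parameter (hypothetical, for illustration).
    if url.path != "/item.htm":
        return []
    ids = parse_qs(url.query).get("id", [])
    return [(ids[0], 1)] if ids else []


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per product ID."""
    counts = Counter()
    for product_id, n in pairs:
        counts[product_id] += n
    return counts


# Hypothetical (host, path) pairs, as they might look after URL restoration.
requests = [
    ("detail.example.com", "/item.htm?id=1001"),
    ("detail.example.com", "/item.htm?id=1002&spm=a"),
    ("detail.example.com", "/item.htm?id=1001"),
    ("www.example.com", "/index.htm"),
]

pairs = [kv for host, path in requests for kv in map_request(host, path)]
page_views = reduce_counts(pairs)
```

Category and brand popularity would follow the same pattern: join each product ID against the crawler-built product information base and use the category or brand as the reduce key instead.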
Year
2014
DOI
10.1109/DSAA.2014.7058097
Venue
DSAA
Keywords
parallel processing,distributed storage,e-shopping giant,mapreduce models,e-business websites,web url restoration,retail data processing,china,information retrieval,traffic analysis,brand,online shopping,mass traffic mining,product page view,product popularity,hadoop based deep packet inspection system,hadoop based dpi system,web sites,internet,category,data mining,internet traffic,telecommunication traffic,electronic commerce,parallel computing,web crawler,packet analysis,web pages,cloud computing,inspection,databases
DocType
Conference
Citations
1
PageRank
0.38
References
0
Authors
4
Name            Order   Citations   PageRank
Jiangtao Luo    1       6           3.50
Yan Liang       2       1           0.38
W. Gao          3       193         33.48
Junchao Yang    4       1           0.38