Title |
---|
Hadoop based Deep Packet Inspection system for traffic analysis of e-business websites |
Abstract |
---|
Internet traffic is growing explosively, and online shopping is one of its significant drivers. However, network operators, unwilling to be dumb pipes, are making every effort to mine this mass traffic with Deep Packet Inspection (DPI), which is a major challenge for massive data when traditional methods and programming models are used. Hadoop provides an alternative approach with its strengths in distributed storage and parallel computing. This paper reports a Hadoop based DPI system integrated with a web crawler. The system architecture and the MapReduce models for packet analysis and web URL restoration are presented. As an example, live web traffic to Tmall, the leading e-shopping giant in China, is investigated using this system. The popularity of products, categories and brands over a given period is evaluated from product page views. Detailed product information is supplied by the product information base built by the web crawler. This work explores the methodology of using Hadoop in DPI and presents guidelines for developing such a system, which network operators can further use to analyze other services and mine the value of network traffic. |
Year | DOI | Venue |
---|---|---|
2014 | 10.1109/DSAA.2014.7058097 | DSAA |
Keywords | DocType | Citations |
---|---|---|
parallel processing, distributed storage, e-shopping giant, mapreduce models, e-business websites, web url restoration, retail data processing, china, information retrieval, traffic analysis, brand, online shopping, mass traffic mining, product page view, product popularity, hadoop based deep packet inspection system, hadoop based dpi system, web sites, internet, category, data mining, internet traffic, telecommunication traffic, electronic commerce, parallel computing, web crawler, packet analysis, web pages, cloud computing, inspection, databases | Conference | 1 |
PageRank | References | Authors |
---|---|---|
0.38 | 0 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jiangtao Luo | 1 | 6 | 3.50 |
Yan Liang | 2 | 1 | 0.38 |
W. Gao | 3 | 193 | 33.48 |
Junchao Yang | 4 | 1 | 0.38 |