Abstract |
---|
Data analytics and scientific computing are two modern application classes whose computation and communication needs have changed substantially in recent years, requiring additional processing capability and bandwidth to keep pace with current demands. These applications commonly run within data centers, exchanging enormous volumes of data and rapidly stressing existing network infrastructures. Thus, it is crucial for data center operations and management to understand and classify the communication demands of these applications. The traditional approaches for classifying application traffic are port-based classification and Deep Packet Inspection, both of which present issues with current network technology. Some recent works propose using machine learning together with statistical information collected from application flows to classify traffic. Applications running in data centers present communication patterns that can be recognized through their traffic matrices. The main contribution of this paper is therefore a method that exploits the textural information extracted from these matrices to classify data center traffic using machine learning techniques. As a proof of concept, we implemented this method in a system named DCTraCS. The experimental dataset was gathered from two real data centers by collecting the traffic matrices of MapReduce and a set of scientific applications every second for a period of 30 minutes. To assess our proposal, we compared it with other machine learning techniques for classifying application traffic found in the current literature. Results show that our approach achieved the highest accuracy, correctly classifying over 99% of our data center applications. |
Year | DOI | Venue |
---|---|---|
2018 | 10.1109/AINA.2018.00161 | PROCEEDINGS 2018 IEEE 32ND INTERNATIONAL CONFERENCE ON ADVANCED INFORMATION NETWORKING AND APPLICATIONS (AINA) |
Keywords | Field | DocType |
---|---|---|
data center communication, pattern recognition, traffic matrix | Deep packet inspection, Data mining, Pace, Data analysis, Computer science, Visualization, Support vector machine, Feature extraction, Bandwidth (signal processing), Data center, Distributed computing | Conference |
ISSN | Citations | PageRank |
---|---|---|
1550-445X | 0 | 0.34 |
References | Authors |
---|---|
0 | 10 |
Name | Order | Citations | PageRank |
---|---|---|---|
Celio Trois | 1 | 24 | 4.22 |
Luis Carlos Erpen De Bona | 2 | 228 | 17.69 |
Luiz S. Oliveira | 3 | 476 | 47.22 |
Magnos Martinello | 4 | 116 | 20.23 |
Douglas Harewood-Gill | 5 | 0 | 0.34 |
Marcos Didonet Del Fabro | 6 | 273 | 34.14 |
Reza Nejabati | 7 | 211 | 43.72 |
Dimitra Simeonidou | 8 | 332 | 80.03 |
João Carlos D. Lima | 9 | 6 | 6.59 |
Benhur Stein | 10 | 0 | 0.34 |