Abstract |
---|
10GbE connectivity is expected to become a standard feature of server platforms in the near future. Among the numerous methods and features proposed to improve the network performance of such platforms is Direct Cache Access (DCA), which routes incoming I/O directly to CPU caches. While this feature has been shown to be promising, significant challenges arise when dealing with high rates of traffic in multiprocessor and multi-core environments. In this paper, we focus on two practical considerations with DCA. In the first case, we show that the performance benefit from DCA can be limited when the network traffic processing rate cannot match the I/O rate. In the second case, we show that affinitizing both stack and application contexts to cores that share a cache is critical. With proper distribution and affinity, we show that a standard Linux network stack runs 32% faster for 2KB to 64KB I/O sizes. |

Year | DOI | Venue |
---|---|---|
2009 | 10.1109/HPCA.2009.4798271 | HPCA-15 2009: Fifteenth International Symposium on High-Performance Computer Architecture, Proceedings |

Keywords | Field | DocType |
---|---|---|
hardware,kernel,network performance,numerical method,protocols,linux,cpu cache,throughput | Kernel (operating system),Cache,CPU cache,Computer science,Parallel computing,Computer network,Multiprocessing,Throughput,Protocol stack,Multi-core processor,Network performance | Conference |

ISSN | Citations | PageRank |
---|---|---|
1530-0897 | 16 | 1.12 |

References | Authors |
---|---|
11 | 3 |

Name | Order | Citations | PageRank |
---|---|---|---|
Amit Kumar | 1 | 18 | 1.56 |
Ram Huggahalli | 2 | 358 | 20.94 |
Srihari Makineni | 3 | 600 | 37.89 |