Abstract | ||
---|---|---|
With the rise in the amount of information being streamed across networks, there is a growing demand to vet its quality, type and content for various purposes such as spam filtering, security and search. In this paper, we develop an energy-efficient, high-performance information filtering system that is capable of classifying a stream of incoming documents at high speed. The prototype parses a stream of documents using a multicore CPU and then performs classification using Field-Programmable Gate Arrays (FPGAs). We implemented a Naive Bayes classifier on our prototype and compared it to an optimized CPU-based baseline on a large TREC data collection. Our empirical findings show that we can classify documents at 10 Gb/s, which is up to 94 times faster than the CPU baseline (and up to 5 times faster than previous FPGA-based implementations). In future work, we aim to increase the throughput by another order of magnitude by implementing both the parser and the filter on the FPGA. |
Year | DOI | Venue |
---|---|---|
2013 | 10.1145/2505515.2507866 | CIKM |
Keywords | Field | DocType |
energy-efficient high performance information,amount information,high throughput,optimized cpu based-baseline,previous fpga,naive bayes classifier,field-programmable gate arrays,high speed,multicore cpu,cpu baseline,empirical finding,filtering,parsing,classification,efficiency,fpga | Data mining,Data collection,Central processing unit,Naive Bayes classifier,Computer science,Filter (signal processing),Field-programmable gate array,Real-time computing,Parsing,Throughput,Information filtering system | Conference |
Citations | PageRank | References |
0 | 0.34 | 10 |
Authors | ||
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Wim Vanderbauwhede | 1 | 226 | 37.98 |
Anton Frolov | 2 | 3 | 0.82 |
Leif Azzopardi | 3 | 1919 | 133.10 |
Sai Rahul Chalamalasetti | 4 | 136 | 16.33 |
Martin Margala | 5 | 318 | 55.78 |