Abstract | ||
---|---|---|
Humans communicate with text in thousands of languages, in dozens of scripts, and in a wide variety of binary codes. There is a need to identify the language, script and code of this text to enable follow-on processing such as transcoding, translation, transliteration, routing and prioritization. This paper deals with the implementation of real-time language and script identification on high-speed hardware (principally a ternary content addressable memory) capable of processing network data streams at several gigabits per second. |
Year | DOI | Venue |
---|---|---|
2007 | 10.1109/ICDMW.2007.54 | ICDM Workshops |
Keywords | Field | DocType |
wide variety,high-speed hardware,ternary content addressable memory,binary code,real-time language,paper deal,high-speed identification,processing network data stream,script identification,follow-on processing,pattern matching,associative memory,background noise,natural languages,data mining,field programmable gate arrays,java,hardware,real time | Data mining,Transcoding,Content-addressable memory,Programming language,Computer science,Artificial intelligence,Binary code,Field-programmable gate array,Natural language,Pattern matching,Machine learning,Scripting language,Transliteration | Conference |
ISBN | Citations | PageRank |
0-7695-3033-8 | 0 | 0.34 |
References | Authors | |
3 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Alan Ratner | 1 | 1 | 1.40 |
Ron Loui | 2 | 0 | 0.34 |