Title
High-Speed Identification of Language and Script
Abstract
Humans communicate with text in thousands of languages, in dozens of scripts, and in a wide variety of binary codes. The language, script, and code of this text must be identified to enable follow-on processing such as transcoding, translation, transliteration, routing, and prioritization. This paper deals with the implementation of real-time language and script identification on high-speed hardware (principally a ternary content-addressable memory) capable of processing network data streams at several gigabits per second.
Year
2007
DOI
10.1109/ICDMW.2007.54
Venue
ICDM Workshops
Keywords
wide variety, high-speed hardware, ternary content addressable memory, binary code, real-time language, paper deal, high-speed identification, processing network data stream, script identification, follow-on processing, pattern matching, associative memory, background noise, natural languages, data mining, field programmable gate arrays, java, hardware, real time
Field
Data mining, Transcoding, Content-addressable memory, Programming language, Computer science, Artificial intelligence, Binary code, Field-programmable gate array, Natural language, Pattern matching, Machine learning, Scripting language, Transliteration
DocType
Conference
ISBN
0-7695-3033-8
Citations
0
PageRank
0.34
References
3
Authors
2
Name          Order   Citations   PageRank
Alan Ratner   1       1           1.40
Ron Loui      2       0           0.34