Abstract |
---|
Identifying the language of an e-text is complicated by the existence of multiple character sets for a single language. We present a language identification system that uses multivariate analysis (MVA) for dimensionality reduction and classification. We compare its performance with existing schemes, namely N-gram-based and compression-based methods. |

Year | DOI | Venue |
---|---|---|
2005 | 10.1007/978-3-540-30586-6_89 | CICLing |

Keywords | Field | DocType |
---|---|---|
multivariate analysis, language identification | Dimensionality reduction, Pattern recognition, Automatic language identification, Computer science, Speech recognition, Natural language processing, Artificial intelligence, Language identification, Character encoding, Multivariate analysis, Language family | Conference |

ISBN | Citations | PageRank |
---|---|---|
3-540-24523-5 | 2 | 0.41 |

References | Authors |
---|---|
2 | 2 |

Name | Order | Citations | PageRank |
---|---|---|---|
Vinosh Babu James | 1 | 2 | 0.41 |
Baskaran Sankaran | 2 | 155 | 13.65 |