Abstract | ||
---|---|---|
Several parallel corpora built from European Union language resources are presented here. They were processed by state-of-the-art tools and made available for researchers in the Sketch Engine corpus management system. A completely new resource is introduced: the EUR-Lex corpus, one of the largest parallel corpora available at the moment, containing 840 million English tokens; its largest language pair (English-French) has more than 25 million aligned segments (paragraphs). |
Year | Venue | Keywords |
---|---|---|
2016 | LREC 2016 - Tenth International Conference on Language Resources and Evaluation | Acquis, DCEP, DGT-TM, Europarl, EUR-Lex, Sketch Engine, parallel corpus, word sketch, parallel concordance |
Field | DocType | Citations |
Word sketch, Computer science, Parallel corpora, Speech recognition, Natural language processing, Artificial intelligence, Management system, Linguistics, Sketch, European Union | Conference | 0 |
PageRank | References | Authors |
0.34 | 2 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Vít Baisa | 1 | 6 | 5.56 |
Jan Michelfeit | 2 | 4 | 0.83 |
Marek Medveď | 3 | 0 | 1.01 |
Miloš Jakubíček | 4 | 25 | 4.42 |