Abstract | ||
---|---|---|
We present a new release of the Czech-English parallel corpus CzEng. CzEng 1.6 consists of about 0.5 billion words ("gigaword") in each language. The corpus is equipped with automatic annotation at a deep syntactic level of representation and alternatively in Universal Dependencies. Additionally, we release the complete annotation pipeline as a virtual machine in the Docker virtualization toolkit. |
Year | DOI | Venue |
---|---|---|
2016 | 10.1007/978-3-319-45510-5_27 | Lecture Notes in Artificial Intelligence |
Keywords | Field | DocType |
---|---|---|
Parallel corpus,Automatic annotation,Machine translation | Virtualization,Czech,Public records,Virtual machine,Annotation,Information retrieval,Computer science,Machine translation,Universal dependencies,Natural language processing,Artificial intelligence,Syntax | Conference |
Volume | ISSN | Citations |
---|---|---|
9924 | 0302-9743 | 3 |
PageRank | References | Authors |
---|---|---|
0.40 | 12 | 8 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ondřej Bojar | 1 | 1701 | 122.71 |
Ondřej Dušek | 2 | 180 | 23.08 |
Tom Kocmi | 3 | 3 | 0.74 |
Jindřich Libovický | 4 | 18 | 5.20 |
Michal Novák | 5 | 55 | 4.03 |
Martin Popel | 6 | 269 | 21.27 |
Roman Sudarikov | 7 | 3 | 0.40 |
Dušan Variš | 8 | 4 | 5.14 |