Title
CzEng 1.6: Enlarged Czech-English Parallel Corpus with Processing Tools Dockered.
Abstract
We present a new release of the Czech-English parallel corpus CzEng. CzEng 1.6 consists of about 0.5 billion words ("gigaword") in each language. The corpus is equipped with automatic annotation at a deep syntactic level of representation and alternatively in Universal Dependencies. Additionally, we release the complete annotation pipeline as a virtual machine in the Docker virtualization toolkit.
Year
2016
DOI
10.1007/978-3-319-45510-5_27
Venue
Lecture Notes in Artificial Intelligence
Keywords
Parallel corpus, Automatic annotation, Machine translation
Field
Virtualization, Czech, Public records, Virtual machine, Annotation, Information retrieval, Computer science, Machine translation, Universal dependencies, Natural language processing, Artificial intelligence, Syntax
DocType
Conference
Volume
9924
ISSN
0302-9743
Citations
3
PageRank
0.40
References
12
Authors
8
Name                Order  Citations  PageRank
Ondřej Bojar        1      1701       122.71
Ondřej Dušek        2      180        23.08
Tom Kocmi           3      3          0.74
Jindrich Libovicky  4      18         5.20
Michal Novák        5      55         4.03
Martin Popel        6      269        21.27
Roman Sudarikov     7      3          0.40
Dusan Varis         8      4          5.14