Title
Annotated Gigaword
Abstract
We have created layers of annotation on the English Gigaword v.5 corpus to render it useful as a standardized corpus for knowledge extraction and distributional semantics. Most existing large-scale work is based on inconsistent corpora that research teams have often had to re-annotate independently, each time introducing biases, so that results are comparable only at a high level. We provide to the community a public reference set based on current state-of-the-art syntactic analysis and coreference resolution, along with an interface for programmatic access. Our goal is to enable broader involvement in large-scale knowledge-acquisition efforts by researchers who might otherwise have been unable to produce such a resource on their own.
Year
2012
Venue
AKBC-WEKEX@NAACL-HLT
Keywords
large-scale knowledge-acquisition effort, English Gigaword v, standardized corpus, existing large-scale work, current state-of-the-art syntactic analysis, broader involvement, coreference resolution, distributional semantics, high level, Annotated Gigaword, inconsistent corpus
DocType
Conference
ISBN
978-1-4503-2411-3
Citations
2
PageRank
0.39
References
11
Authors
3
Name                  Order  Citations  PageRank
Courtney Napoles      1      148        12.44
Matthew Gormley       2      84         10.25
Benjamin Van Durme    3      1268       92.32