Abstract |
---|
We have created layers of annotation on the English Gigaword v.5 corpus to make it useful as a standardized corpus for knowledge extraction and distributional semantics. Most existing large-scale work is based on inconsistent corpora that research teams have often had to re-annotate independently, each time introducing biases, so that results are comparable only at a high level. We provide the community with a public reference set based on current state-of-the-art syntactic analysis and coreference resolution, along with an interface for programmatic access. Our goal is to enable broader involvement in large-scale knowledge-acquisition efforts by researchers who otherwise might not have been able to produce such a resource on their own. |

Year | Venue | Keywords |
---|---|---|
2012 | AKBC-WEKEX@NAACL-HLT | large-scale knowledge-acquisition effort, English Gigaword v, standardized corpus, existing large-scale work, current state-of-the-art syntactic analysis, broader involvement, coreference resolution, distributional semantics, high level, Annotated Gigaword, inconsistent corpus |

DocType | ISBN | Citations |
---|---|---|
Conference | 978-1-4503-2411-3 | 2 |

PageRank | References | Authors |
---|---|---|
0.39 | 11 | 3 |

Name | Order | Citations | PageRank |
---|---|---|---|
Courtney Napoles | 1 | 148 | 12.44 |
Matthew Gormley | 2 | 84 | 10.25 |
Benjamin Van Durme | 3 | 1268 | 92.32 |