Abstract | ||
---|---|---|
We present a method of applying a broad-coverage LFG grammar of German in the process of semi-automatic lexicon acquisition from corpora. The identification of corpus instances that illustrate a certain subcategorization frame uniquely is done by a comparison of the numbers of analyses the grammar assigns to the corpus instances, under the assumption of different hypothetical lexicon entries for the candidate verb. Filtering conditions expressed on the feature representation output by the grammar further restrict the sentences that the automatic extraction step is based on. Experiments show that the grammar-based method produces better results than a method based on patterns in a corpus query language. 1. Background This paper reports ongoing research activities in methods for semi-automatic lexicon acquisition from corpora (cf. also Eckle and Heid, 1996; Eckle-Kohler, 1998). As a test application, the lexical resources constructed with various methods are being used in a broad-coverage LFG grammar of German under development at the IMS Stuttgart. With the method reported in this paper, a bootstrapping cycle is closed: the lexical resources are no longer just applied in the LFG grammar, but application of the grammar also feeds back into the construction of further resources. |
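
The comparison step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `count_analyses` callable is a hypothetical stand-in for a call to the LFG parser, the frame labels are invented, and the reading of "uniquely" as "exactly one hypothesized frame yields any analysis" is an assumption. The `keep` predicate is likewise a hypothetical placeholder for the filtering conditions on the grammar's feature representation.

```python
from typing import Callable, Dict, Iterable, List, Optional

# Hypothetical parser interface (assumption): returns how many analyses the
# grammar assigns to `sentence` when `verb` is given the lexicon entry `frame`.
CountFn = Callable[[str, str, str], int]


def unique_frame(sentence: str, verb: str, frames: Iterable[str],
                 count_analyses: CountFn) -> Optional[str]:
    """Return the only frame under which the sentence gets an analysis, else None."""
    parsing = [f for f in frames if count_analyses(sentence, verb, f) > 0]
    return parsing[0] if len(parsing) == 1 else None


def collect_evidence(sentences: Iterable[str], verb: str, frames: List[str],
                     count_analyses: CountFn,
                     keep: Callable[[str, str], bool] = lambda s, f: True
                     ) -> Dict[str, List[str]]:
    """Group sentences by the frame they illustrate unambiguously."""
    evidence: Dict[str, List[str]] = {f: [] for f in frames}
    for s in sentences:
        f = unique_frame(s, verb, frames, count_analyses)
        # `keep` stands in for additional filtering conditions (assumption).
        if f is not None and keep(s, f):
            evidence[f].append(s)
    return evidence


# Toy data replacing real parser output; sentence ids and counts are made up.
FAKE_COUNTS = {
    ("s1", "subj-obj"): 2, ("s1", "subj-only"): 0,  # only one frame parses
    ("s2", "subj-obj"): 1, ("s2", "subj-only"): 3,  # ambiguous: both parse
    ("s3", "subj-obj"): 0, ("s3", "subj-only"): 0,  # no analysis at all
}


def fake_count(sentence: str, verb: str, frame: str) -> int:
    return FAKE_COUNTS.get((sentence, frame), 0)


evidence = collect_evidence(["s1", "s2", "s3"], "geben",
                            ["subj-obj", "subj-only"], fake_count)
```

In this toy run only `s1` is retained, since it is the one sentence that parses under a single hypothesized frame; ambiguous or unparsable sentences contribute no evidence.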
Year | Venue | Keywords |
---|---|---|
1998 | LREC | query language |
DocType | Citations | PageRank |
Conference | 4 | 0.55 |
References | Authors | |
1 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jonas Kuhn | 1 | 34 | 8.90 |
Judith Eckle-Kohler | 2 | 156 | 14.85 |
Christian Rohrer | 3 | 26 | 3.48 |