Abstract |
---|
We describe and experimentally evaluate a complete method for the automatic acquisition of two-level rules for morphological analyzers/generators. The input to the system is sets of source-target word pairs, where the target is an inflected form of the source. There are two phases in the acquisition process: (1) segmentation of the target into morphemes and (2) determination of the optimal two-level rule set with minimal discerning contexts. In phase one, a minimal acyclic finite state automaton (AFSA) is constructed from string edit sequences of the input pairs. Segmentation of the words into morphemes is achieved through viewing the AFSA as a directed acyclic graph (DAG) and applying heuristics using properties of the DAG as well as the elementary edit operations. For phase two, the determination of the optimal rule set is made possible with a novel representation of rule contexts, with morpheme boundaries added, in a new DAG. We introduce the notion of a delimiter edge. Delimiter edges are used to select the correct two-level rule type as well as to extract minimal discerning rule contexts from the DAG. Results are presented for English adjectives, Xhosa noun locatives and Afrikaans noun plurals. |
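
The abstract mentions that phase one starts from string edit sequences of the input pairs. As a rough illustration only, the sketch below aligns a source word with its inflected target using an ordinary unit-cost Levenshtein alignment and prints the result as `source:target` symbol pairs, with `0` standing for a null. The function names, the unit costs, and the tie-breaking order are assumptions made for this example; they are not taken from the paper, and the alignment the authors' system would choose for a given pair may differ.

```python
NULL = "0"  # assumed placeholder for "no character on this side"


def edit_sequence(source, target):
    """Align source with target; return a list of (source_char, target_char) pairs."""
    m, n = len(source), len(target)
    # dist[i][j] = minimal number of edits turning source[:i] into target[:j]
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = i
    for j in range(1, n + 1):
        dist[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + sub,  # copy or substitute
                dist[i - 1][j] + 1,        # delete a source character
                dist[i][j - 1] + 1,        # insert a target character
            )
    # Trace back from the bottom-right corner. Ties are broken in a fixed,
    # arbitrary order (diagonal, then deletion, then insertion).
    pairs = []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = 0 if source[i - 1] == target[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + sub:
                pairs.append((source[i - 1], target[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            pairs.append((source[i - 1], NULL))
            i -= 1
        else:
            pairs.append((NULL, target[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


def format_pairs(pairs):
    """Render an alignment as space-separated source:target symbol pairs."""
    return " ".join(f"{s}:{t}" for s, t in pairs)


if __name__ == "__main__":
    print(format_pairs(edit_sequence("happy", "happier")))
```

Several alignments usually share the same minimal cost, so the one returned here depends on the tie-breaking order and need not be the linguistically preferred one. Choosing among such alternatives is the kind of decision the segmentation heuristics described in the abstract would have to make.
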
Year | DOI | Venue |
---|---|---|
1997 | 10.3115/974557.974573 | ANLP |
Keywords | Field | DocType |
automatic acquisition,minimal acyclic finite state,minimal discerning context,two-level morphological rule,correct two-level rule type,two-level rule,delimiter edge,new dag,rule context,optimal rule set,optimal two-level rule,minimal discerning rule context,directed acyclic graph,computational linguistics,finite state automaton,noun | Morpheme,Computer science,Noun,Computational linguistics,Finite-state machine,Directed acyclic graph,Heuristics,Artificial intelligence,Natural language processing,Delimiter,Directed acyclic word graph | Conference |
Citations | PageRank | References |
16 | 3.07 | 6 |
Authors |
---|
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Pieter Theron | 1 | 16 | 3.07 |
Ian Cloete | 2 | 132 | 16.61 |