Abstract | ||
---|---|---|
We have proposed a method of word segmentation for non-segmented languages using Inductive Learning. The method uses only the surface information of a text, so it has the advantage of not depending on any specific language. We assume that a character string that appears frequently in a text is likely to be a word. The method predicts unknown words by recursively extracting common character strings. With the proposed method, the segmentation results can adapt to different users and fields. To evaluate its effectiveness for Chinese word segmentation and its adaptability to different fields, we conducted an evaluation experiment with Chinese text from two fields. |
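
The core idea in the abstract, that frequently recurring character strings are likely words, can be illustrated with a minimal sketch. This is not the authors' Inductive Learning procedure: the function names `frequent_strings` and `segment`, the thresholds `min_count` and `max_len`, and the greedy longest-match strategy are all assumptions made for illustration, and the recursive extraction of common strings is omitted.

```python
from collections import Counter


def frequent_strings(text, min_count=2, max_len=4):
    """Collect substrings of length 2..max_len that occur at least min_count times.

    Hypothetical helper: only surface character counts are used, with no dictionary.
    """
    counts = Counter(
        text[i:i + n]
        for n in range(2, max_len + 1)
        for i in range(len(text) - n + 1)
    )
    return {s for s, c in counts.items() if c >= min_count}


def segment(text, lexicon, max_len=4):
    """Greedy longest-match segmentation against the candidate lexicon.

    Characters not covered by any candidate are emitted as single-character tokens.
    """
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate first, down to length 2.
        for n in range(min(max_len, len(text) - i), 1, -1):
            if text[i:i + n] in lexicon:
                tokens.append(text[i:i + n])
                i += n
                break
        else:
            # No candidate matched at this position.
            tokens.append(text[i])
            i += 1
    return tokens


text = "abcxabcyabc"
lexicon = frequent_strings(text)
print(segment(text, lexicon))  # ['abc', 'x', 'abc', 'y', 'abc']
```

Because the candidate lexicon is rebuilt from whatever text is supplied, the same code yields different segmentations for different corpora, which is a rough analogue of the adaptability to users and fields that the abstract describes.
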
Year | DOI | Venue |
---|---|---|
2002 | 10.3115/1118824.1118836 | SIGHAN@COLING |
Keywords | DocType | Citations |
---|---|---|
character string, word segmentation, Chinese text, dynamic adapting, segmentation result, different field, unknown word, inductive learning, word segmentation method, different user, common character string, Chinese word segmentation | Conference | 3 |
PageRank | References | Authors |
---|---|---|
0.40 | 2 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Zhongjian Wang | 1 | 3 | 2.43 |
Kenji Araki | 2 | 343 | 80.17 |
Koji Tochinai | 3 | 60 | 16.90 |