Abstract | ||
---|---|---|
Textual information written in Chinese now represents a huge knowledge repository. The first step in managing and processing written Chinese text is segmentation. This paper outlines a new method for automatic Chinese text segmentation that uses evolutionary algorithms and statistical data from Web searches. The proposed method treats Web text as a de facto corpus that updates automatically, thus eliminating the need for statistical training. It treats segmentation as a search for the most probable way in which individual characters combine into sentences, paragraphs, and articles, producing segmentation results that are tailored to the text in question and independent of segmentation standards. |
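
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes a candidate segmentation is encoded as one boolean per gap between characters, that fitness is a sum of add-one-smoothed log probabilities, and that Web hit counts are available through a lookup. `TOY_HITS`, `TOTAL`, `hit_count`, `split`, `fitness`, and `segment` are hypothetical names, and the counts are invented for illustration; a real system would query a search engine instead.

```python
import math
import random

# Invented toy counts standing in for Web search hit counts (assumption).
TOY_HITS = {"我": 900, "喜欢": 800, "北京": 950,
            "喜": 120, "欢": 100, "北": 150, "京": 130}
TOTAL = 10_000  # assumed corpus size used to turn counts into probabilities


def hit_count(word):
    """Hypothetical lookup; a real system would query a search engine here."""
    return TOY_HITS.get(word, 0)


def split(text, cuts):
    """Cut `text` after position i wherever cuts[i] is True."""
    words, start = [], 0
    for i, cut in enumerate(cuts):
        if cut:
            words.append(text[start:i + 1])
            start = i + 1
    words.append(text[start:])
    return words


def fitness(text, cuts):
    """Sum of add-one-smoothed log probabilities of the resulting words."""
    return sum(math.log((hit_count(w) + 1) / TOTAL) for w in split(text, cuts))


def segment(text, pop_size=20, generations=100, mutation_rate=0.3, seed=0):
    """Evolve boundary vectors and return the best segmentation found."""
    rng = random.Random(seed)
    n = len(text) - 1  # number of possible boundary positions
    if n <= 0:
        return [text]

    def score(cuts):
        return fitness(text, cuts)

    population = [[rng.random() < 0.5 for _ in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection
        children = [population[0]]             # elitism: keep the best candidate
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            point = rng.randrange(n + 1)       # one-point crossover
            child = a[:point] + b[point:]
            # Flip each boundary bit with probability mutation_rate.
            child = [(not bit) if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children
    return split(text, max(population, key=score))


if __name__ == "__main__":
    print(segment("我喜欢北京"))
```

Under these toy counts the highest-scoring split of 我喜欢北京 is 我 / 喜欢 / 北京, because any candidate containing a string with no hits is penalised heavily. The mutation rate is set high only because the toy search space is tiny.
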
Year | DOI | Venue |
---|---|---|
2013 | 10.1109/ICNC.2013.6818079 | ICNC |
Keywords | Field | DocType |
sentences,evolutionary computation,knowledge repository,paragraphs,articles,information retrieval,de facto corpus,web search statistical data,evolutionary algorithms,genetic algorithm,internet,chinese information processing,textual information,natural language processing,statistical segmentation,segmentation standards,text analysis,automatic chinese text segmentation,chinese text segmentation,evolutionary approach,n-best segmentations,probability,pragmatics,genetic algorithms,training data | Scale-space segmentation,Information processing,Evolutionary algorithm,Segmentation,Textual information,Computer science,Segmentation-based object categorization,Text segmentation,Natural language processing,Artificial intelligence,Genetic algorithm | Conference |
Citations | PageRank | References |
1 | 0.36 | 45 |
Authors | ||
1 |
Name | Order | Citations | PageRank |
---|---|---|---|
Dong Zhang | 1 | 3 | 2.07 |