New word identification in social network text based on time series information - Citegraph

Paper Info

Title
New word identification in social network text based on time series information

Abstract
Unlike languages widely used in Western countries, such as English or French, Chinese has no spaces between words, so text must be segmented before further processing. New word identification is an important problem in segmentation, especially when the targets are social network texts, which contain more abbreviations and other non-standard forms. Several methods have been proposed to detect Chinese new words, but most treat the corpus as a static set and ignore time-domain information. In contrast, we regard our social network corpus as a text series spread along a timeline and design a new kind of feature, called dynamic features, that reflects the temporal variation of a string's statistical features. Experimental results on a dataset crawled from the largest microblogging application in China show that this method can significantly improve Chinese new word identification.

Year	DOI	Venue
2014	10.1109/CSCWD.2014.6846904	CSCWD
Keywords	Field	DocType
time domain,western countries,microblogging application,social network text,social network,string statistical features,time series information,social network corpus,segmentation process,chinese language,chinese new words,english,internet,new word identification,segmentation targets,time domain information,natural language processing,social networking (online),text analysis,french,time series,text series,entropy,feature extraction,vectors	Time domain,Social media,Social network,Segmentation,Computer science,Microblogging,Artificial intelligence,Natural language processing,Time line	Conference
Citations	PageRank	References
0	0.34	7
Authors
3

Authors (3 rows)

Name	Order	Citations	PageRank
Meng Wang	1	4	1.41
Lanfen Lin	2	78	24.70
Feng Wang	3	20	2.34
