Title
Frequency, collocation, and statistical modeling of lexical items: a case study of temporal expressions in an elderly speaker corpus
Abstract
This study examines how different dimensions of corpus frequency data may affect the outcome of statistical modeling of lexical items. The corpus used in our analysis is an elderly speaker corpus in an early stage of development, and the target words are temporal expressions, which may reveal how speech produced by the elderly is organized. We conduct divisive hierarchical clustering on two dimensions of corpus data: raw frequency distributions and collocation-based vectors. The results show that the target terms cluster differently depending on which dimension of data is used as input; analyses based on frequency distributions and on collocational patterns yield distinct groupings. Specifically, statistically based collocational analysis produces more distinct clusters and differentiates temporal terms more finely than analysis based on raw frequency.
Year
2011
Venue
ROCLING
Keywords
different way, raw frequency, elderly speaker corpus, temporal expression, corpus frequency data, case study, different dimension, lexical item, statistical modeling, differentiate temporal term, corpus data, frequency distribution, raw frequency distribution, collocational pattern, gerontology, collocation, corpus linguistics, clustering
Field
Frequency distribution, Computer science, Natural language processing, Corpus linguistics, Artificial intelligence, Cluster analysis, Collocation, Hierarchical clustering, Pattern recognition, Lexical item, Speech recognition, Temporal expressions, Statistical model
DocType
Conference
Citations
0
PageRank
0.34
References
7
Authors
5
Name             Order  Citations  PageRank
Sheng-Fu Wang    1      0          1.35
Jing-Chen Yang   2      0          1.01
Yu-Yun Chang     3      0          3.38
Yu-Wen Liu       4      0          0.68
Shu-kai Hsieh    5      47         21.47