Title
Large scale data based linguistic investigations using speech technology tools: The case of Romanian
Abstract
This paper summarizes previous efforts to build an ASR system for Romanian. The data developed within the ASR framework are then used to conduct linguistic studies. The first study is dedicated to morpho-phonetic processes in Romanian, such as the deletion of the masculine definite article -l and the realization of word-final palatalized consonants as a plural marker in nouns and a person marker in verb conjugation. The data show that the two phenomena are variable in continuous speech and depend on the degree of spontaneity of the corpus. The second study is dedicated to the acoustic properties of Romanian vowels. It draws on a 7-hour corpus used as development and evaluation data to build the ASR system. The data confirm a seven-vowel system. They also highlight an acoustic proximity and a complementary distribution of the non-low central vowels [ɨ] and [ʌ]. The current findings support previous hypotheses built from laboratory data investigations and encourage further explorations on large-scale data.
Year
2015
DOI
10.1109/SPED.2015.7343108
Venue
2015 International Conference on Speech Technology and Human-Computer Dialogue (SpeD)
Keywords
large scale data based linguistic, speech technology tool, ASR system, morpho-phonetic process, Romanian vowels acoustic property
DocType
Conference
Citations
0
PageRank
0.34
References
9
Authors
3
Name, Order, Citations, PageRank
Ioana Vasilescu, 1, 64, 16.01
Camille Dutrey, 2, 17, 3.06
L. Lamel, 3, 2135, 361.63