Abstract | ||
---|---|---|
We present a novel, data-driven approach to assessing mutual similarities and differences among a group of languages, based on purely prosodic characteristics, namely f0 and energy envelope signals. These signals are decomposed using the continuous wavelet transform; the components represent f0 and energy patterns on three levels of the prosodic hierarchy, roughly corresponding to syllables, words and phrases. Unigram language models with states derived from a combination of delta features obtained from these components are trained and compared using a mutual perplexity measure. In this pilot study we apply this approach to a small corpus of spoken material from seven languages (Estonian, Finnish, Hungarian, German, Swedish, Russian and Slovak) with a rich history of mutual language contacts. We present similarity trees (dendrograms) derived from the models using the hierarchically decomposed prosodic signals separately as well as combined, and compare them with patterns obtained from non-decomposed signals. We show that (1) plausible similarity patterns, reflecting language family relationships and the known contact history, can be obtained even from a relatively small data set, and (2) the hierarchical decomposition approach using both f0 and energy provides the most comprehensive results. |
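
The model-comparison step named in the abstract can be illustrated with a minimal Python sketch. It assumes that each language's recordings have already been reduced to a sequence of discrete state labels (the labels `rise`, `fall` and `flat` are hypothetical), that the unigram models use add-one smoothing, and that mutual perplexity is the mean of the two cross-perplexities. None of these choices is confirmed by the abstract, and all function names are illustrative only.

```python
import math
from collections import Counter


def train_unigram(states, vocab, alpha=1.0):
    """Additively smoothed unigram distribution over a fixed state vocabulary."""
    counts = Counter(states)
    total = len(states) + alpha * len(vocab)
    return {s: (counts[s] + alpha) / total for s in vocab}


def perplexity(model, states):
    """Perplexity of a unigram model on a sequence of states."""
    log_prob = sum(math.log(model[s]) for s in states)
    return math.exp(-log_prob / len(states))


def mutual_perplexity(seq_a, seq_b, vocab):
    """Mean of the two cross-perplexities (an assumed, symmetric definition)."""
    model_a = train_unigram(seq_a, vocab)
    model_b = train_unigram(seq_b, vocab)
    return 0.5 * (perplexity(model_a, seq_b) + perplexity(model_b, seq_a))


# Hypothetical toy data: state labels standing in for quantised delta features.
vocab = ["rise", "fall", "flat"]
lang_a = ["rise"] * 6 + ["fall"] * 3 + ["flat"] * 1
lang_b = ["rise"] * 1 + ["fall"] * 3 + ["flat"] * 6

print(mutual_perplexity(lang_a, lang_a, vocab))  # a language against itself
print(mutual_perplexity(lang_a, lang_b, vocab))  # two differing languages
```

Under these assumptions, a lower value indicates more similar state distributions, and a full matrix of pairwise values could serve as the distance input to a hierarchical clustering routine that produces a dendrogram.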
Year | DOI | Venue |
---|---|---|
2017 | 10.21437/Interspeech.2017-1044 | 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION |
Keywords | Field | DocType |
language comparison, prosodic typology, wavelet transform, statistical modelling | Computer science, Speech recognition | Conference |
ISSN | Citations | PageRank |
2308-457X | 0 | 0.34 |
References | Authors |
3 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Juraj Simko | 1 | 35 | 8.20 |
Antti Suni | 2 | 87 | 9.42 |
Katri Hiovain | 3 | 0 | 0.34 |
Martti Vainio | 4 | 209 | 25.72 |