Abstract | ||
---|---|---|
The paper contains a description of the Spoken Language Corpus of Swedish at the Department of Linguistics, Göteborg University
(GSLC), and a summary of the various types of analysis and tools that have been developed for work on this corpus. Work on
the corpus was started in the late 1970s. It is growing incrementally and presently consists of 1.3 million words from about
25 different social activities. The corpus was initiated to meet a growing interest in naturalistic spoken language data.
It is based on the fact that spoken language varies considerably in different social activities with regard to pronunciation,
vocabulary, grammar and communicative functions. The goal of the corpus is to include spoken language from as many social
activities as possible to get a more complete understanding of the role of language and communication in human social life.
This type of spoken language corpus is still fairly unusual, even for English, since many spoken language corpora (certainly
for Swedish) have been collected for special purposes, like speech recognition, phonetics, dialectal variation or interaction
with a computerized dialog system in a very narrow domain, e.g. MapTask (Isard and Carletta 1995), TRAINS (Heeman and Allen 1994) and Waxholm (Blomberg et al. 1993). In Table 1.1, we compare GSLC to some other corpora. The table provides a comparison of corpora with regard to language, activity types,
dialects, type of interaction, total duration, number of recordings, number of transcribed words, the purpose of the corpus,
chosen transcription format, age of the participants, medium (audio or video) and some other features.
Table 1.1 Comparison of spoken language corpora
|
Year | DOI | Venue |
---|---|---|
2001 | 10.3115/1118078.1118079 | SIGDIAL Workshop |
Keywords | DocType | Citations |
göteborg corpus,lund corpus,different social activity,spoken language corpus,human social life,language data,social activity,spoken new zealand english,english corpus,danish bysoc corpus,language corpus | Conference | 3 |
PageRank | References | Authors |
0.75 | 1 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jens Allwood | 1 | 284 | 26.37 |
Leif Grönqvist | 2 | 17 | 1.90 |
Elisabeth Ahlsén | 3 | 108 | 11.30 |
Magnus Gunnarsson | 4 | 41 | 2.25 |