Balanced Corpus Of Informal Spoken Czech: Compilation, Design And Findings - Citegraph

Paper Info

Title
Balanced Corpus Of Informal Spoken Czech: Compilation, Design And Findings

Abstract
The paper presents ORAL2008, a new 1-million corpus of spoken Czech compiled within the framework of the Czech National Corpus project. ORAL2008 is designed as a representation of authentic spoken language used in informal situations and it is balanced in the main sociolinguistic categories of speakers. The paper concentrates also on the data collection, its broad coverage and the transcription system that registers variability of spoken Czech. Possible findings based on the provided data are finally outlined.

Year	Venue	Keywords
2009	INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5	spoken language, transcription, corpus design, sociolinguistic representativeness
Field	DocType	Citations
Czech,Computer science,Speech recognition,Natural language processing,Artificial intelligence	Conference	2
PageRank	References	Authors
0.64	1	3

Authors (3 rows)

Name	Order	Citations	PageRank
Martina Waclawičová	1	3	1.12
Michal Křen	2	7	2.31
Lucie Válková	3	3	1.12
