ASR for Under-Resourced Languages From Probabilistic Transcription. - Citegraph

Paper Info

Title
ASR for Under-Resourced Languages From Probabilistic Transcription.

Abstract
In many under-resourced languages it is possible to find text, and it is possible to find speech, but transcribed speech suitable for training automatic speech recognition (ASR) is unavailable. In the absence of native transcripts, this paper proposes the use of a probabilistic transcript: a probability mass function over possible phonetic transcripts of the waveform. Three sources of probabilistic transcripts are demonstrated. First, self-training is a well-established semisupervised learning technique, in which a cross-lingual ASR first labels unlabeled speech, and is then adapted using the same labels. Second, mismatched crowdsourcing is a recent technique in which nonspeakers of the language are asked to write what they hear, and their nonsense transcripts are decoded using noisy channel models of second-language speech perception. Third, EEG distribution coding is a new technique in which nonspeakers of the language listen to it, and their electrocortical response signals are interpreted to indicate probabilities. ASR was trained in four languages without native transcripts. Adaptation using mismatched crowdsourcing significantly outperformed self-training, and both significantly outperformed a cross-lingual baseline. Both EEG distribution coding and text-derived phone language models were shown to improve the quality of probabilistic transcripts derived from mismatched crowdsourcing.
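
To make the idea of a probabilistic transcript concrete, the following is a minimal sketch of noisy-channel decoding of one mismatched transcript via Bayes' rule. It is not the paper's implementation: the function name `probabilistic_transcript`, the candidate phone strings, and every probability value are hypothetical, and the sketch assumes exactly one written letter per spoken phone (no insertions or deletions).

```python
from math import prod

# Hypothetical prior over candidate phone strings, standing in for a
# text-derived phone language model. Values are illustrative only.
prior = {("p", "a"): 0.5, ("b", "a"): 0.3, ("p", "o"): 0.2}

# Hypothetical channel model: P(letter written by a nonspeaker | phone spoken).
channel = {
    "p": {"p": 0.6, "b": 0.4},
    "b": {"p": 0.3, "b": 0.7},
    "a": {"a": 0.8, "o": 0.2},
    "o": {"a": 0.3, "o": 0.7},
}


def probabilistic_transcript(letters, prior, channel):
    """Return a pmf over candidate phone strings given one annotator's letters."""
    scores = {}
    for phones, p in prior.items():
        if len(phones) != len(letters):
            continue  # toy assumption: one letter per phone
        # Likelihood of the written letters under this candidate phone string.
        likelihood = prod(
            channel[ph].get(ch, 0.0) for ph, ch in zip(phones, letters)
        )
        scores[phones] = p * likelihood
    total = sum(scores.values())
    if total == 0.0:
        raise ValueError("no candidate explains the observed letters")
    # Normalize so the scores form a probability mass function.
    return {phones: s / total for phones, s in scores.items()}


# A nonspeaker wrote "ba"; the result spreads probability over all candidates.
pmf = probabilistic_transcript("ba", prior, channel)
for phones, p in sorted(pmf.items(), key=lambda kv: -kv[1]):
    print("".join(phones), round(p, 3))
```

The result keeps probability mass on every candidate rather than committing to a single best guess, which is the property that distinguishes a probabilistic transcript from a deterministic one.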

Year	DOI	Venue
2017	10.1109/TASLP.2016.2621659	IEEE/ACM Trans. Audio, Speech & Language Processing
Keywords	Field	DocType
Speech,Electroencephalography,Probabilistic logic,Crowdsourcing,Brain models,Artificial neural networks	Probability mass function,Nonsense,Crowdsourcing,Computer science,Speech recognition,Coding (social sciences),Phone,Artificial intelligence,Natural language processing,Probabilistic logic,Speech perception,Language model	Journal
Volume	Issue	ISSN
25	1	2329-9290
Citations	PageRank	References
5	0.51	26
Authors
16

Authors (16 rows)

Name	Order	Citations	PageRank
Mark Hasegawa-Johnson	1	1189	112.85
Preethi Jyothi	2	57	7.85
Daniel McCloy	3	5	0.51
Majid Mirbagheri	4	6	1.87
Giovanni M. di Liberto	5	24	3.71
Amit Kumar Das	6	190	30.00
Bradley Ekin	7	5	0.85
Chunxi Liu	8	23	3.28
Vimal Manohar	9	54	7.99
Hao Tang	10	43	5.30
Edmund C. Lalor	11	68	12.39
Nancy F. Chen	12	120	28.98
Paul Hager	13	5	0.51
Tyler Kekona	14	5	0.51
Rose Sloan	15	5	0.51
Adrian K. C. Lee	16	5	0.51