Title
Toward human-assisted lexical unit discovery without text resources
Abstract
This work addresses lexical unit discovery for languages without (usable) written resources. Previous work has addressed this problem using entirely unsupervised methodologies. Our approach, in contrast, investigates the use of linguistic and speaker knowledge, which is often available even when text resources are not. We create a framework that benefits from such resources without assuming orthographic representations or generating word-level transcriptions. We adapt a universal phone recognizer to the target language and use it to convert audio into a searchable phone string for lexical unit discovery via fuzzy sub-string matching. Linguistic knowledge is used to constrain both the phone recognition output and the lexical unit discovery performed on that output.
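The abstract does not say how the fuzzy sub-string matching is implemented. As an illustration only, the sketch below uses a standard approximate substring search (edit distance in which a match may start at any position) over a sequence of phone symbols. The function name `fuzzy_find`, the phone symbols, and the distance threshold are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def fuzzy_find(query: Sequence[str], phones: Sequence[str],
               max_dist: int) -> List[Tuple[int, int]]:
    """Return (end_index, distance) for each position in `phones` where some
    substring ending there matches `query` within `max_dist` edits."""
    m = len(query)
    # prev[i]: edit distance between query[:i] and the best substring of
    # `phones` ending at the previous position (initially the empty text).
    prev = list(range(m + 1))
    hits = []
    for j, phone in enumerate(phones):
        # A match may start anywhere, so matching the empty query costs 0.
        cur = [0]
        for i in range(1, m + 1):
            cost = 0 if query[i - 1] == phone else 1
            cur.append(min(prev[i - 1] + cost,  # match or substitution
                           prev[i] + 1,         # extra phone in the text
                           cur[i - 1] + 1))     # query phone missing in text
        if cur[m] <= max_dist:
            hits.append((j + 1, cur[m]))
        prev = cur
    return hits


# Hypothetical recognizer output and query, written as phone symbols.
phones = "k a t s a t k a d".split()
print(fuzzy_find(["k", "a", "t"], phones, 0))  # [(3, 0)]
print(fuzzy_find(["k", "a", "t"], phones, 1))
```

Raising `max_dist` tolerates more recognizer errors at the cost of more spurious hits. In a setting like the one described, linguistic knowledge could plausibly be used to restrict which substitutions are allowed, but that refinement is not shown here.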
Year
2016
DOI
10.1109/SLT.2016.7846246
Venue
2016 IEEE Spoken Language Technology Workshop (SLT)
Keywords
lexical discovery, low resource languages, automatic speech recognition
Field
USable, Data modeling, Transcription (linguistics), Pragmatics, Computer science, Lexical item, Fuzzy logic, Speech recognition, Phone, Artificial intelligence, Natural language processing
DocType
Conference
ISSN
2639-5479
ISBN
978-1-5090-4904-2
Citations
0
PageRank
0.34
References
6
Authors
8
Name | Order | Citations | PageRank
Chris Bartels | 1 | 0 | 0.34
Wen Wang | 2 | 106 | 11.93
Vikramjit Mitra | 3 | 299 | 24.83
Colleen Richey | 4 | 118 | 10.91
Andreas Kathol | 5 | 68 | 11.86
Dimitra Vergyri | 6 | 373 | 36.97
Harry Bratt | 7 | 153 | 15.12
Chiachi Hung | 8 | 0 | 0.34