Abstract
---

We propose a cross-media lecture-on-demand system, called LODEM, which searches a lecture video for specific segments in response to a text query. We exploit the complementary text, audio, and video data associated with a single lecture. LODEM extracts the audio track from a target lecture video, generates a transcription by large-vocabulary continuous speech recognition, and produces a text index. A user can formulate text queries using the textbook related to the target lecture and can selectively view specific video segments by submitting those queries. Experimental results showed that adapting the speech recognizer to the lecturer and the topic of the target lecture increased recognition accuracy and, consequently, yielded retrieval accuracy comparable to that obtained with human transcription. LODEM is implemented as a client–server system on the Web to facilitate e-learning.
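The abstract describes a pipeline in which ASR transcripts are turned into a text index so that a query retrieves matching video segments. The following is a minimal, hypothetical sketch of that idea (not the authors' implementation): an inverted index over time-stamped transcript segments, with conjunctive term matching to select segments.

```python
# Hypothetical sketch of the indexing/retrieval step described in the
# abstract: build an inverted index over ASR transcript segments and
# return the video segments whose text contains all query terms.
# Segment boundaries, tokenization, and ranking are simplified.
from collections import defaultdict


def build_index(segments):
    """segments: list of (start_sec, end_sec, transcript_text) tuples."""
    index = defaultdict(set)
    for i, (_, _, text) in enumerate(segments):
        for term in text.lower().split():
            index[term].add(i)
    return index


def search(index, segments, query):
    """Return segments whose transcripts contain every query term."""
    term_sets = [index.get(t, set()) for t in query.lower().split()]
    if not term_sets:
        return []
    hits = set.intersection(*term_sets)
    return [segments[i] for i in sorted(hits)]


# Example with invented segment data:
segments = [
    (0.0, 30.0, "introduction to speech recognition"),
    (30.0, 60.0, "building a text index from transcripts"),
    (60.0, 90.0, "retrieval of video segments by text query"),
]
index = build_index(segments)
print(search(index, segments, "text index"))
# → [(30.0, 60.0, 'building a text index from transcripts')]
```

A real system would additionally rank segments (e.g. by term weighting) and align index hits back to video playback positions; this sketch only shows the segment-lookup core.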
Year | DOI | Venue
---|---|---
2006 | 10.1016/j.specom.2005.08.006 | Speech Communication

Keywords | DocType | Volume
---|---|---
Cross-media retrieval, Speech recognition, Spoken document retrieval, Adaptation, Lecture video | Journal | 48

Issue | ISSN | Citations
---|---|---
5 | 0167-6393 | 4

PageRank | References | Authors
---|---|---
0.49 | 24 | 3

Name | Order | Citations | PageRank
---|---|---|---
Atsushi Fujii | 1 | 486 | 59.25
Katunobu Itou | 2 | 319 | 44.36
Tetsuya Ishikawa | 3 | 226 | 30.46