Title
Video and audio based detection of filled hesitation pauses in classroom lectures
Abstract
In this paper we study the detection of hesitation filled pauses in oral presentations of university lectures taught in the Greek language and recorded on a tablet PC with specialized software. We propose a hierarchical approach that fuses video data with audio data to increase the precision of the detection system. The detection method works at the frame level rather than the usual segmental level, which allows more accurate synchronization of the audio and video data after the detected hesitations are removed. Audio characteristics are modeled using Gaussian Mixture Models, while the stationarity of the recorded video is also taken into account. This efficient combination of video and audio yields higher precision and recall rates than other works in the literature. On a dataset of approximately 7 hours, the precision rate is 99.6% and the recall rate is 84.7% when both audio and video data are taken into account.
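The abstract gives only an outline of the pipeline, so the following is a minimal sketch of frame-level audio and video fusion, not the authors' implementation. It assumes two diagonal-covariance Gaussian mixture models (one for hesitation frames, one for ordinary speech), a per-frame video motion score, and the rule that a frame is flagged only when the audio test fires and the video is still. All names, thresholds, and the toy data are hypothetical.

```python
import math

def diag_gmm_loglik(x, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    comps = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            ll += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        comps.append(ll)
    m = max(comps)  # log-sum-exp for numerical stability
    return m + math.log(sum(math.exp(c - m) for c in comps))

def detect_frames(audio_feats, video_motion, pause_gmm, speech_gmm,
                  llr_thresh=0.0, motion_thresh=0.1):
    """Flag a frame only if the audio test fires AND the video is stationary.

    Assumption: low motion supports the hesitation hypothesis; the abstract
    does not specify how stationarity enters the decision.
    """
    flags = []
    for x, motion in zip(audio_feats, video_motion):
        llr = diag_gmm_loglik(x, *pause_gmm) - diag_gmm_loglik(x, *speech_gmm)
        flags.append(llr > llr_thresh and motion < motion_thresh)
    return flags

def precision_recall(predicted, truth):
    """Frame-level precision and recall for boolean label sequences."""
    tp = sum(p and t for p, t in zip(predicted, truth))
    fp = sum(p and not t for p, t in zip(predicted, truth))
    fn = sum(t and not p for p, t in zip(predicted, truth))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy one-dimensional example (hypothetical values, not from the paper).
pause_gmm = ([1.0], [[0.0]], [[1.0]])   # (weights, means, variances)
speech_gmm = ([1.0], [[5.0]], [[1.0]])
audio_feats = [[0.1], [4.9], [0.2], [0.0]]
video_motion = [0.0, 0.0, 0.5, 0.05]
truth = [True, False, True, True]

flags = detect_frames(audio_feats, video_motion, pause_gmm, speech_gmm)
print(flags)                            # [True, False, False, True]
print(precision_recall(flags, truth))
```

In this toy run the third frame sounds like a hesitation but is rejected because the video is moving. That raises precision at the cost of recall, which matches the direction of the trade-off reported in the abstract.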
Year
2009
DOI
10.5281/zenodo.41616
Venue
EUSIPCO
Keywords
gaussian processes, audio signal processing, educational computing, educational institutions, mixture models, sensor fusion, speech processing, video signal processing, gaussian mixture models, greek language, audio based detection, audio characteristics modeling, audio data synchronization, classroom lectures, hesitation filled pause detection, hierarchical approach, oral presentations, specialized software, tablet pc, university lectures, video based detection, video data fusion, video data synchronization
Field
Video processing, Synchronization, Markov process, Recall rate, Computer science, Precision and recall, Speech recognition, Software, Mixture model
DocType
Conference
ISBN
978-161-7388-76-7
Citations
2
PageRank
0.51
References
7
Authors
3
Name                Order  Citations  PageRank
Vassilis Tsiaras    1      27         6.75
Panagiotakis, C.    2      172        8.24
Yannis Stylianou    3      1436       140.45