MMToC: A Multimodal Method for Table of Content Creation in Educational Videos - Citegraph

Paper Info

Title
MMToC: A Multimodal Method for Table of Content Creation in Educational Videos

Abstract
In this paper we propose a multimodal method called MMToC for automatically creating a table of content for educational videos. MMToC defines and quantifies word saliency for visual words extracted from the slides and spoken words obtained from the speech transcript. The saliency scores from these two modalities are combined to obtain a ranked list of salient words. These ranked words along with their saliency scores are used to formulate a topic segmentation cost function. The cost function is optimized using a dynamic program framework to obtain the topic segments of the video. These segments are labelled with their corresponding topic names for creating the table of content. We perform experiments on 24 hours of lectures spread across 23 videos ranging over 20-75 minutes duration each. We compare the proposed method with LDA-based video segmentation approaches and show that the proposed MMToC method is significantly better (F-score improvement of 0.19 and 0.24 on two datasets). We also perform a user study to demonstrate the effectiveness of MMToC for navigating educational videos.
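
The pipeline outlined in the abstract (combine per-word saliency from two modalities, then choose topic boundaries with a dynamic program) can be illustrated with a minimal sketch. It is not the paper's cost function: the mixing weight `alpha`, the within-segment scatter cost, the fixed segment count `k`, and all function names are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple


def combine_saliency(visual: Dict[str, float], spoken: Dict[str, float],
                     alpha: float = 0.5) -> Dict[str, float]:
    """Blend two per-word saliency maps; alpha is a hypothetical mixing weight."""
    words = set(visual) | set(spoken)
    return {w: alpha * visual.get(w, 0.0) + (1 - alpha) * spoken.get(w, 0.0)
            for w in words}


def segment_cost(units: List[List[str]], saliency: Dict[str, float],
                 i: int, j: int) -> float:
    """Assumed cost: scatter of saliency-weighted word counts in units[i:j]
    around their per-word mean (low when the units share salient words)."""
    n = j - i
    vocab = {w for u in units[i:j] for w in u if w in saliency}
    cost = 0.0
    for w in vocab:
        vals = [saliency[w] * u.count(w) for u in units[i:j]]
        mean = sum(vals) / n
        cost += sum((v - mean) ** 2 for v in vals)
    return cost


def segment_topics(units: List[List[str]], saliency: Dict[str, float],
                   k: int) -> List[Tuple[int, int]]:
    """Split units into k contiguous segments minimising the total cost."""
    n = len(units)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of units")
    inf = float("inf")
    # best[s][j]: minimal cost of covering units[0:j] with s segments
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for s in range(1, k + 1):
        for j in range(s, n + 1):
            for i in range(s - 1, j):
                c = best[s - 1][i] + segment_cost(units, saliency, i, j)
                if c < best[s][j]:
                    best[s][j] = c
                    back[s][j] = i
    # Walk the back-pointers from the end to recover (start, end) pairs.
    bounds = []
    j = n
    for s in range(k, 0, -1):
        i = back[s][j]
        bounds.append((i, j))
        j = i
    return bounds[::-1]


def label_segment(units: List[List[str]], saliency: Dict[str, float],
                  i: int, j: int) -> str:
    """Name a segment after its word with the highest total saliency."""
    totals: Dict[str, float] = {}
    for u in units[i:j]:
        for w in u:
            totals[w] = totals.get(w, 0.0) + saliency.get(w, 0.0)
    return max(totals, key=totals.get)
```

Here each unit stands for a time-ordered chunk of the lecture, such as the words of one transcript sentence. Calling `segment_topics(units, saliency, k)` returns half-open index ranges, and `label_segment` gives each range a candidate table-of-content entry.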

Year
2015

DOI
10.1145/2733373.2806253

Venue
ACM Multimedia

Keywords
Multimodal, table of content, educational videos, visual saliency, text saliency, temporal segmentation, dynamic program

Field
Modalities, Salience (neuroscience), Computer science, Table of contents, Ranging, Artificial intelligence, Computer vision, Ranking, Segmentation, Speech recognition, Multimedia, Salient, Visual Word

DocType
Conference

Citations
8

PageRank
0.62

References
25

Authors
3

Authors (3 rows)

Name	Order	Citations	PageRank
Arijit Biswas	1	747	38.43
Ankit Gandhi	2	16	2.92
Om Deshmukh	3	56	10.55
