Mining Spoken Dialogue Corpora for System Evaluation and Modeling - Citegraph

Paper Info

Title
Mining Spoken Dialogue Corpora for System Evaluation and Modeling

Abstract
We are interested in the problem of modeling and evaluating spoken language systems in the context of human-machine dialogs. Spoken di- alog corpora allow for a multidimensional anal- ysis of speech recognition and language under- standing models of dialog systems. Therefore language models can be directly trained based either on the dialog history or its equivalence class (or cluster). In this paper we propose an algorithm to mine dialog traces which exhibit similar patterns and are identied by the same class. For this purpose we apply data clustering methods to large human-machine spoken dia- logue corpora. The resulting clusters can be used for system evaluation and language mod- eling. By clustering dialog traces we expect to learn about the behavior of the system with re- gards to not only the automation rate but the nature of the interaction (e.g. easy vs dicult dialogs). The equivalence classes can also be used in order to automatically adapt the lan- guage model, the understanding module and the dialogue strategy to better t the kind of in- teraction detected. This paper investigates dif- ferent ways for encoding dialogues into multi- dimensional structures and dieren t clustering methods. Preliminary results are given for clus- ter interpretation and dynamic model adapta- tion using the clusters obtained.

Year	Venue	Keywords
2004	EMNLP	language model,speech recognition,data clustering
Field	DocType	Volume
Computer science,System evaluation,Speech recognition,Natural language processing,Artificial intelligence	Conference	W04-32
Citations	PageRank	References
7	0.50	4
Authors
3

Authors (3 rows)

Cited by (7 rows)

References (4 rows)

Name	Order	Citations	PageRank
Frederic Bechet	1	14	1.78
Giuseppe Riccardi	2	1046	101.15
Dilek Hakkani-Tür	3	282	17.30

1