Abstract |
---|
Conversational agents are gaining popularity with the increasing ubiquity of smart devices. However, training agents in a data-driven manner is challenging due to a lack of suitable corpora. This paper presents a novel method for gathering topical, unstructured conversational data in an efficient way: self-dialogues through crowd-sourcing. Alongside this paper, we include a corpus of 3.6 million words across 23 topics. We argue for the utility of the corpus by comparing self-dialogues with standard two-party conversations as well as data from other corpora. |
Year | Venue | Field |
---|---|---|
2018 | arXiv: Computation and Language | World Wide Web, Data-driven, Reflexive pronoun, Computer science, Popularity, Natural language processing, Artificial intelligence |
DocType | Volume | Citations |
---|---|---|
Journal | abs/1809.06641 | 1 |
PageRank | References | Authors |
---|---|---|
0.35 | 5 | 8 |
Name | Order | Citations | PageRank |
---|---|---|---|
Joachim Fainberg | 1 | 3 | 0.73 |
Ben Krause | 2 | 46 | 4.53 |
Mihai Dobre | 3 | 3 | 0.73 |
Marco Damonte | 4 | 27 | 4.26 |
Emmanuel Kahembwe | 5 | 3 | 0.73 |
Daniel Duma | 6 | 28 | 3.48 |
Bonnie Lynn Webber | 7 | 1511 | 317.14 |
Federico Fancellu | 8 | 7 | 4.18 |