Abstract | ||
---|---|---|
In this demonstration we introduce Trivia Masster, a system that generates a very large database of facts on a variety of topics and uses it for question answering. The facts are collected from human users (the "crowd"); the system motivates users to contribute to the database through a trivia game in which they gain points based on their contributions. A key challenge is to provide a suitable data cleaning mechanism that identifies which of the facts (answers to trivia questions) submitted by users are indeed correct and reliable, and consequently determines how many points to grant users, how to answer questions based on the collected data, and which questions to present to the trivia players in order to improve data quality. As no single existing data cleaning technique provides a satisfactory solution to this challenge, we propose a novel approach based on a declarative framework for defining recursive and probabilistic data cleaning rules. Our solution employs an algorithm based on Markov Chain Monte Carlo methods. |
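
The abstract does not spell out the cleaning rules or the sampler, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a toy model in which an answer is likely correct when reliable users submitted it, and a user is reliable when their answers are correct (the recursive part). Each user's reliability is integrated out under an assumed Beta(2, 1) prior, and a plain Metropolis sampler estimates the probability that each candidate answer is correct. The data, the function names `log_weight` and `estimate_marginals`, and all parameters are hypothetical.

```python
import math
import random
from collections import defaultdict

# Hypothetical toy data: (user, question, answer) triples.
SUBMISSIONS = [
    ("alice", "capital_fr", "Paris"),
    ("bob", "capital_fr", "Paris"),
    ("carol", "capital_fr", "Lyon"),
    ("alice", "capital_au", "Canberra"),
    ("bob", "capital_au", "Canberra"),
    ("carol", "capital_au", "Sydney"),
]


def log_weight(truth, submissions, prior=(2.0, 1.0)):
    """Unnormalized log-probability of a truth assignment.

    Assumption: a user agrees with the truth with an unknown reliability
    drawn from Beta(prior); integrating it out gives a Beta function of
    the user's agreement and disagreement counts.
    """
    agree = defaultdict(int)
    disagree = defaultdict(int)
    for user, question, answer in submissions:
        if truth[question] == answer:
            agree[user] += 1
        else:
            disagree[user] += 1
    a0, b0 = prior
    total = 0.0
    for user in set(agree) | set(disagree):
        a = agree[user] + a0
        b = disagree[user] + b0
        total += math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return total


def estimate_marginals(submissions, steps=20000, burn_in=2000, seed=0):
    """Metropolis sampling over truth assignments; returns, per question,
    the estimated probability that each submitted answer is correct."""
    rng = random.Random(seed)
    candidates = defaultdict(list)
    for _, question, answer in submissions:
        if answer not in candidates[question]:
            candidates[question].append(answer)
    questions = sorted(candidates)
    truth = {q: candidates[q][0] for q in questions}
    current = log_weight(truth, submissions)
    counts = {q: defaultdict(int) for q in questions}
    for step in range(steps):
        # Symmetric proposal: re-draw the answer of one random question.
        q = rng.choice(questions)
        old = truth[q]
        truth[q] = rng.choice(candidates[q])
        proposed = log_weight(truth, submissions)
        if rng.random() < math.exp(min(0.0, proposed - current)):
            current = proposed  # accept the move
        else:
            truth[q] = old  # reject and restore the previous answer
        if step >= burn_in:
            for question in questions:
                counts[question][truth[question]] += 1
    kept = steps - burn_in
    return {q: {a: counts[q][a] / kept for a in candidates[q]}
            for q in questions}


if __name__ == "__main__":
    for question, dist in estimate_marginals(SUBMISSIONS).items():
        print(question, dist)
```

In this sketch, the estimated probabilities could play the roles the abstract mentions: scoring a user's contribution, choosing the answer to return for a question, and flagging questions whose answers remain uncertain as candidates to present to players.
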
Year | DOI | Venue |
---|---|---|
2011 | 10.1109/ICDE.2011.5767941 | ICDE |
Keywords | Field | DocType |
markov chain monte carlo,probabilistic data,trivia masster,trivia question,trivia game,key challenge,suitable data,trivia player,data quality,users gain point,existing single data,question answering,probability,games,probabilistic logic,markov processes,markov process,reliability,databases,data analysis,monte carlo method,monte carlo methods,very large database | Data mining,Monte Carlo method,Markov process,Question answering,Data quality,Markov chain Monte Carlo,Computer science,Very large database,Theoretical computer science,Probabilistic logic,Database,Recursion | Conference |
ISSN | Citations | PageRank |
1084-4627 | 4 | 0.48 |
References | Authors | |
8 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Daniel Deutch | 1 | 345 | 41.49 |
Ohad Greenshpan | 2 | 195 | 13.43 |
Boris Kostenko | 3 | 13 | 1.43 |
Tova Milo | 4 | 4074 | 1052.72 |